PRODUCTION OF 3-HYDROXYPROPIONIC ACID IN RECOMBINANT ORGANISMS

CROSS-REFERENCE TO RELATED APPLICATIONS 
This application claims priority from U.S. Provisional Patent Application S.N.
60/151,440 filed August 30, 1999.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
The research project which gave rise to the invention described in this patent
application was supported by EPA grant R824726-01. The United States Government
may have certain rights in this invention.



The technology of genetic engineering allows the transfer of genetic traits
between species and permits, in particular, the transfer of enzymes from one species to
others. These techniques first reached commercialization in connection with
high-value-added products such as pharmaceuticals. The techniques of genetic engineering
are equally applicable and cost effective when applied to genes and enzymes which can
be used to make basic chemical feedstocks.

A metabolic pathway of interest exists in the bacterium Klebsiella pneumoniae,
which has the ability to biologically produce 3-hydroxypropionaldehyde from glycerol.
Native microorganisms have the ability to produce 1,3-propanediol from glycerol as
well. Commercial interests are exploring the production of 1,3-propanediol from
glycerol or glucose in recombinant organisms which have been engineered to express
the enzymes necessary for 1,3-propanediol production from other organisms.

3-Hydroxypropionic acid, CAS Registry Number [503-66-2] (abbreviated as 3-HP), is a
three-carbon non-chiral organic molecule. The IUPAC nomenclature name for
this molecule is propionic acid, 3-hydroxy. It is also known as 3-hydroxypropionate,
β-hydroxypropionic acid, β-hydroxypropionate, 3-hydroxypropionic acid,
3-hydroxypropanoate, hydracrylic acid, ethylene lactic acid, β-lactic acid and
2-deoxyglyceric acid. Applications of 3-HP include the manufacture of absorbable
prosthetic devices and surgical sutures, incorporation into beta-lactams, production of
acrylic acid, formation of trifluoromethylated alcohols or diols, polyhydroxyalkanoates,
and co-polymers with lactic acid. 3-HP for commercial use is now commonly produced
by organic chemical syntheses. The 3-HP produced and sold by these methods is
relatively expensive, and it would be cost prohibitive to use it for the production of
monomers for polymer production. As discussed below, some organisms are known to
produce 3-HP. However, a catalog of genes from these organisms is not yet available,
and thus the ability to synthesize 3-HP using the enzymes natively
responsible for the synthesis of that molecule in the native hosts which produce it does
not now exist.

In addition to its commercial utility, 3-HP is found in a number of biological
processes, notably including many naturally occurring bio-polymers.
Poly(3-hydroxybutyrate) (PHB) is the most abundant member of the microbial polyesters which
contain hydroxy monomers, termed polyhydroxyalkanoates (PHAs). PHB has utility as a
biodegradable thermoplastic material and was first produced industrially in 1982.

The majority of published research on PHAs that contain 3-HP has concentrated
on two bacterial sources: Ralstonia eutropha ("Alcaligenes eutrophus") and
Pseudomonas oleovorans. Both Ralstonia eutropha and Pseudomonas oleovorans are
able to grow on nitrogen-free media containing 3-hydroxypropionic acid,
1,5-pentanediol or 1,7-heptanediol. When 3-HP is the major hydroxy acid added to the
growth media, poly(3-hydroxybutyrate-co-3-hydroxypropionic acid) is formed
containing 7 mol % 3-hydroxypropionic acid. These cells also store 3 mol %
3-hydroxypropionic acid as poly(3-hydroxybutyrate-co-3-hydroxypropionic acid).

Recombinant systems have been used to create PHAs. An E. coli strain
engineered to express PHA synthase from either Ralstonia eutropha or Zoogloea
ramigera produced poly(3-hydroxypropionic acid) when fed 1,3-propanediol.
Skraly, F. A. "Polyhydroxyalkanoates Produced by Recombinant E. coli." Poster at
Engineering Foundation Conference: Metabolic Engineering II, 1998. An E. coli strain
that expressed PHA synthase (MBX820), when provided with the genes encoding
glycerol dehydratase and 1,3-propanediol dehydratase from K. pneumoniae, and
4-hydroxybutyryl-CoA transferase from Clostridium kluyveri, synthesized PHB from
glucose.

Glycerol dehydratase, found in the bacterial pathway for the conversion of
glycerol to 1,3-propanediol, catalyzes the conversion of glycerol to
3-hydroxypropionaldehyde and water. This enzyme has been found in a number of
bacteria including strains of Citrobacter, Klebsiella, Lactobacillus, Enterobacter and
Clostridium. In the 1,3-propanediol pathway a second enzyme, 1,3-propanediol
oxidoreductase (EC 1.1.1.202), reduces 3-hydroxypropionaldehyde to
1,3-propanediol in an NADH-dependent reaction. The pathway for the conversion of
glycerol to 1,3-propanediol has been expressed in E. coli. Tong et al., Applied and
Environmental Microbiology 57 (12) 3541-3546. The genes responsible for the
production of 1,3-propanediol were cloned from the dha regulon of Klebsiella
pneumoniae. Glycerol is transported into the cell by the glycerol facilitator, and then
converted into 3-hydroxypropionaldehyde by a coenzyme B12-dependent
dehydratase. E. coli lacks a native dha regulon; consequently, E. coli cannot grow
anaerobically on glycerol without an exogenous electron acceptor such as nitrate or
fumarate.

Aldehyde dehydrogenases are enzymes that catalyze the oxidation of aldehydes
to carboxylic acids. The genes encoding non-specific aldehyde dehydrogenases have
been identified in a wide variety of organisms, e.g., ALDH2 from Homo sapiens, ALD4
from Saccharomyces cerevisiae, and from E. coli both aldA and aldB, to name a few.
These enzymes are classified by co-factor usage: most require either NAD+ or NADP+,
and some will use either co-factor. The genes singled out for mention here are able to
act on a number of different aldehydes, and it is likely that they may be able to oxidize
3-hydroxypropionaldehyde to 3-hydroxypropionic acid.
BRIEF SUMMARY OF THE INVENTION
The present invention is intended to permit the creation of a recombinant
microbial host which is capable of synthesizing 3-HP from a starting material of
glycerol or glucose. The glycerol or glucose is converted to
3-hydroxypropionaldehyde (abbreviated as 3-HPA), which is then converted to 3-HP.
This process requires the so-called dhaB gene from Klebsiella pneumoniae, which
encodes the enzyme glycerol dehydratase, and any one of four different aldehyde
dehydrogenase genes to convert 3-HPA to 3-HP. The four aldehyde dehydrogenase
genes used were aldA from the bacterium E. coli, ALDH2 from humans, ALD4 from the
yeast Saccharomyces cerevisiae, and aldB from E. coli. The yeast gene appeared to give
the best results.

It is an object of the present invention to provide a genetic construct which
encodes glycerol dehydratase and aldehyde dehydrogenase enzymes necessary for the
production of 3-hydroxypropionic acid from glycerol.

It is also an object of the present invention to provide a method for the
production of 3-hydroxypropionic acid from glycerol.

Other features and advantages of the invention will be apparent from the
following description of the preferred embodiment thereof and from the claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Not applicable. 

DETAILED DESCRIPTION OF THE INVENTION 
It is disclosed here that it is possible to introduce into a bacterial host genes
encoding two enzymes and thus confer upon that host the ability to produce 3-HP from
glycerol. The two necessary enzymes are glycerol dehydratase and aldehyde
dehydrogenase. It is here reported that the two enzymes are both necessary and
sufficient to enable a strain of a suitable host, such as a competent E. coli strain, to make
3-HP from glycerol. An exemplary gene encoding a glycerol dehydratase is known, the
dhaB gene from Klebsiella pneumoniae, sequenced and rendered convenient to use.
Several exemplary aldehyde dehydrogenases are known, and their sequences are
presented here. From this information, it becomes practical to confer upon a bacterial
host the ability to convert glycerol into 3-HP in a commercially reasonable manner.

It was not apparent before the completion of the work described here that these
two diverse enzymes could be produced in a common host to confer the ability to
make 3-HP. There are many known aldehyde dehydrogenase enzymes and genes, and
the enzymes are known to have varying substrate specificities and efficiencies. There
was no evidence, prior to the work described here, that the aldehyde dehydrogenase
enzyme would work on the 3-hydroxypropionaldehyde (3-HPA) substrate to create
3-HP. Without that knowledge, there were no data from which to predict the effectiveness
of the 3-HP production studies described below. An additional uncertainty arises from
the fact that the intermediate aldehyde, 3-HPA, is toxic to many bacterial hosts, and thus
the survival of the host is dependent upon the relative rates of enzymatic production and
conversion of the aldehyde intermediate to non-toxic 3-HP.

A difficulty in the realization of the production of 3-HP desired here is that
ribosome binding sites from non-native hosts are often ineffectual and lead to poor
protein production, and that many non-native promoters are poorly transcribed and
are a bar to high protein expression. However, the inventors also recognized that a
non-native promoter that is known to be very active and is inducible by the addition of a
small molecule unrelated to the pathway being expressed is often a very efficient way to
express and regulate the levels of enzymes expressed in hosts such as E. coli. To
achieve high levels of regulated gene expression, plasmids were constructed which
placed the expression of all exogenous genes necessary for the production of
3-hydroxypropionic acid from glycerol under the regulation of the trc promoter. The trc
promoter is efficient, not native to E. coli, and inducible by the addition of IPTG.

The present specification describes a genetic construct for use in the production
of 3-hydroxypropionic acid from glycerol. The genetic construct includes exemplary
DNA sequences coding for the expression of a glycerol dehydratase and a DNA
sequence coding for aldehyde dehydrogenase. The set of exemplary sequences
necessary for the expression of glycerol dehydratase is collectively referred to as
"dhaB". The set of sequences necessary for the expression of aldehyde dehydrogenase
includes any one of four different genes which proved efficacious. The individual
aldehyde dehydrogenase sequences are referred to individually as ALD4, ALDH2, aldA and
aldB.

Producing 3-hydroxypropionic acid in a foreign host

In the work described below, the enzymes necessary for the production of
3-hydroxypropionic acid from glycerol in E. coli were expressed under the regulation of
the trc promoter, a non-native promoter inducible by the addition of IPTG. The
glycerol dehydratase was encoded by the dhaB gene from Klebsiella pneumoniae, and the
aldehyde dehydrogenase used was encoded by any one of four different genes (ALDH2 from
Homo sapiens, ALD4 from S. cerevisiae, aldB from E. coli or aldA from E. coli). Expression
of the genes coding for glycerol dehydratase and any one of the genes encoding an
aldehyde dehydrogenase was sufficient to enable the construct to produce 3-HP when
the fermentation media was supplemented with glycerol. In all of these constructs, the
dhaB gene was downstream from the gene encoding the aldehyde dehydrogenase used,
and expression of both genes was regulated by the trc promoter. This order, however, is
not required; other orders of the genes on a construct and the use of multiple constructs
are possible.

In a minimal genetic construct made based on the data presented here, the only
genetic elements present that would be necessary are the structural genes dhaB and an
aldehyde dehydrogenase gene encoding a protein that efficiently catalyzes the oxidation
of 3-hydroxypropionaldehyde to 3-hydroxypropionic acid, and non-native promoter
sequences specifically selected to give the type of inducible control most appropriate for
the context of the process in which the construct is to be used. Extraneous pieces of
DNA, whether retained in the construct or added from other DNA sequences, would not
necessarily be detrimental to effective 3-HP synthesis by the host organism, but would
not be needed. Each sequence to be translated would necessarily be preceded by a
ribosome binding site functional in the selected host, so that the messenger RNA(s)
coding for the proteins of interest could be translated by ribosomes. Terminator
sequences immediately downstream of each translated unit would also be necessary in
some organisms, particularly in eukaryotes. The construct could be part of an
autonomously replicating sequence, such as a plasmid or phage vector, or could be
integrated into the genome of the host.
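
The structural requirements just listed (a promoter upstream, a ribosome binding site
immediately before each translated sequence, and dhaB plus one aldehyde dehydrogenase
gene) can be sketched as a small consistency check. This is only an illustrative model:
the Element class, the check_construct function and the element names "rbs1", "rbs2" and
"term" are hypothetical and are not part of the constructs described in this specification.

```python
from dataclasses import dataclass

# The four aldehyde dehydrogenase genes named in the specification.
ALDH_GENES = {"aldA", "aldB", "ALD4", "ALDH2"}


@dataclass(frozen=True)
class Element:
    kind: str  # "promoter", "rbs", "orf" or "terminator" (hypothetical labels)
    name: str


def check_construct(elements):
    """Return a list of problems found in a linear, single-strand construct."""
    problems = []
    transcribed = False  # True while we are downstream of a promoter
    previous = None
    for element in elements:
        if element.kind == "promoter":
            transcribed = True
        elif element.kind == "terminator":
            transcribed = False
        elif element.kind == "orf":
            if not transcribed:
                problems.append(f"{element.name}: no promoter upstream")
            if previous is None or previous.kind != "rbs":
                problems.append(
                    f"{element.name}: not preceded by a ribosome binding site"
                )
        previous = element
    orfs = {e.name for e in elements if e.kind == "orf"}
    if "dhaB" not in orfs:
        problems.append("no dhaB gene")
    if not orfs & ALDH_GENES:
        problems.append("no aldehyde dehydrogenase gene")
    return problems


# Operon-style layout: one promoter, aldehyde dehydrogenase gene, then dhaB.
operon = [
    Element("promoter", "trc"),
    Element("rbs", "rbs1"),
    Element("orf", "ALD4"),
    Element("rbs", "rbs2"),
    Element("orf", "dhaB"),
    Element("terminator", "term"),
]
print(check_construct(operon))
```

The model deliberately ignores orientation and multiple replicons; it only captures the
ordering rules stated in the text.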

The structural genes and appropriate promoter(s) could be isolated by the use of
restriction enzymes, by the polymerase chain reaction (PCR), by chemical synthesis of
the appropriate oligonucleotides, or by other methods apparent to those skilled in the art
of molecular biology. The promoter(s) would be derived from genomic DNA of other
organisms or from artificial genetic constructs containing promoters. Appropriate
promoter fragments would be ligated into the construct upstream of the structural genes
in any one of several possible arrangements.

The aldehyde dehydrogenase expressed would: have high specific activity
towards 3-hydroxypropionaldehyde; be very stable in the host in which it is expressed; be
readily overexpressed in the selected host; not be inhibited by either the substrates
necessary for the reaction or the products formed by the reaction; be fully active under
the fermentation conditions most favorable for the production of 3-hydroxypropionic
acid; and be able to use either NAD+ or NADP+.

One possible arrangement is the true operon, where one promoter is used to
direct transcription in one direction of all necessary Open Reading Frames (ORFs). The
entire message is then contained in one messenger RNA. The advantages of the operon
are that it is relatively easy to construct, since only one promoter is needed; that it is
relatively simple to replace the promoter with another promoter if that would be
desirable later; and that it assures that the two genes are under the same regulation. The
main disadvantage of the operon scheme is that the levels of expression of the two
genes cannot be varied independently. If it is found that the genes, for optimal
3-hydroxypropionic acid synthesis, should be expressed at different levels, the operon in
most cases cannot be used to realize this.

Another possible arrangement is the multiple-promoter scheme. Two or more
promoters, with the same or distinct regulatory behavior, could be used to direct
transcription of the genes. For example, one promoter could be used to direct
transcription of dhaB and one to direct transcription of the gene encoding the
appropriate aldehyde dehydrogenase. Because the genes theoretically can be
transcribed and translated separately, a great number of combinations of multiple
promoters is possible. Additionally, it would be most desirable to prevent the promoters
from interfering with one another. This could be achieved either by placing two
promoters into the construct such that they direct transcription in opposite directions, or
by inserting transcriptional terminator sequences downstream of each separately
transcribed unit. The main advantage of the multiple-promoter construct is that it
permits independent regulation of as many distinct units as desired, which could be
important. The disadvantages are that it would be more difficult to construct; more
difficult to amend later; and more difficult to effectively regulate, since multiple
changes in fermentation conditions would need to be introduced and might render the
performance of the fermentation somewhat less predictable.

In any construct, the promoter sequence(s) used should be functional in the
selected host organism and preferably provide sufficient transcription of the genes
comprising the glycerol to 3-hydroxypropionic acid pathway to enable the construct to
be adequately active in that host. The promoter sequence(s) used would also effect
regulation of transcription of the genes, enabling the glycerol to 3-HP pathway to be
adequately active under the fermentation conditions employed for 3-HP production, and
preferably they would be inducible, such that expression of the genes could be
modulated by the inclusion in, or exclusion from, the fermentation of certain agents or
conditions.

A plausible example of the use of such a construct follows: one promoter, which is
induced by the addition of an inexpensive chemical (the inducer) to the medium, could
control transcription of both the dhaB gene and the gene encoding the appropriate
aldehyde dehydrogenase. The cells would be permitted to grow in the absence of the
inducer until they accumulated to a predetermined level. The inducer would then be
added to the fermentation, and nutritional changes commensurate with the altered
metabolism would be made to the medium as well. The cells would then be permitted to
utilize the substrate(s) provided for 3-HP production (and additional biomass production
if desired). After the cells could no longer use substrate to produce 3-HP, the
fermentation would be stopped and the 3-HP recovered.






Genetic Sequences 

To express glycerol dehydratase and a suitable aldehyde dehydrogenase, the two
enzymes necessary for the production of 3-hydroxypropionic acid from glycerol, it is
required that the DNA sequences containing the glycerol dehydratase and aldehyde
dehydrogenase coding sequences be combined with at least a promoter sequence
(preferably a non-native promoter, although some native promoter activity may be
present). An exemplary method of construction is described in the example below. To
ensure that the present specification is enabling, the full sequences of the coding regions
of genes for these enzymes are presented here.

Sequences 1, 3, 5 and 7 present different native genomic sequences for genes
encoding aldehyde dehydrogenases.

SEQ ID NO:1 contains the full native DNA sequence encoding the ALD4
enzyme from Saccharomyces cerevisiae. The amino acid sequence of the protein is
presented as SEQ ID NO:2.

SEQ ID NO:3 includes the DNA sequence for the human ALDH2 gene, again
including the full protein coding region. The amino acid sequence for this human
aldehyde dehydrogenase is presented in SEQ ID NO:4.

SEQ ID NO:5 and 7 respectively present the full coding sequences from the E.
coli genes aldA and aldB, both of which encode aldehyde dehydrogenases. The amino
acid sequences for the proteins encoded by the genes are presented in SEQ ID NO:6
and 8 respectively.

SEQ ID NO:9 contains the native genomic DNA sequence for the dhaB gene
from the dha regulon of Klebsiella pneumoniae. The coding sequences for this
complex regulon produce five polypeptides, which are presented as SEQ ID NOS:10
through 13, which together provide the activity of the glycerol dehydratase enzyme.
Each of these coding sequences can be used to make genetic constructs for the
expression of the appropriate enzymes in heterologous hosts. In making genetic
constructs for expression of the genes in such hosts, it is contemplated that heterologous
promoters will be joined to the coding sequences for the enzymes, but all that is required
is that the promoters be effective for the hosts in which the genes are to be expressed. It
is also contemplated and envisioned that significant variations in DNA sequence are
possible from the native DNA coding sequences presented here. As is well known in
the art, due to the degeneracy of the genetic code, many different DNA sequences can
encode the expression of the same protein. So, when this document uses language
specifying a DNA sequence encoding a protein, it is intended to encompass any DNA
sequence which can be used to express that protein even if different from the genomic
sequences presented here. It is also contemplated that conservative changes in the
amino acid sequences of the proteins specified here can be made without departing from
the present invention. In particular, deletions, additions and substitutions of one or more
amino acids in a protein sequence can almost always be made without changing protein
functionality. When the name of a protein is used here, it is intended to be equally
applicable to both such minor changes in amino acid sequence and to allelic variations
in native protein sequence as occur within the species named as well as other closely
related species.
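
The point about degeneracy of the genetic code can be illustrated with a short sketch
that translates two different DNA sequences into the same peptide. The two 12-base
sequences below are invented for illustration and are not taken from any SEQ ID in this
specification; the codon table is the standard genetic code, and the translate helper is
a hypothetical name.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding strand codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two illustrative sequences that differ at three positions, including the
# stop codon (TGA versus TAA), yet encode the same peptide.
variant_a = "ATGGCTAAATGA"
variant_b = "ATGGCGAAGTAA"

print(translate(variant_a), translate(variant_b))
print(translate(variant_a) == translate(variant_b))
```

The same reasoning applies to a change of stop codon from TGA to TAA: the DNA differs,
but the encoded protein does not.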

It is possible that many of the above DNA sequences could be truncated and still
express a protein that has the same enzymatic properties. One skilled in the art of
molecular biology would appreciate that minor deletions, additions and mutations may
not change the attributes of the designated base pair sequences; many of the nucleotides
of the designated base pair sequences are probably not essential for their unique
function. To determine whether or not an altered sequence or sequences has sufficient
homology with the designated base pairs to function identically, one would simply
create the candidate mutation, deletion or alteration and create a gene construct
including the altered sequence together with promoter and termination sequences. This
gene construct could be tested, as described below, for the production of 3-HP from
glycerol.

Certain DNA primers were used to isolate or clone the genomic DNA sequences
used in the experiments described below. While the sequence information presented
here is sufficient to enable the construction of expression plasmids incorporating the
genes identified here, in order to redundantly enable the use of these genes, primers
which may be used to isolate the genes from their native hosts are described below.

The primers aldA_L (SEQ ID NO:14) and aldA_R (SEQ ID NO:15) were used
to amplify the 1513 bp aldA fragment from genomic E. coli DNA (strain MG1655, a
gift from the Genetic Stock Center, New Haven, CT). The gel-purified PCR fragment
containing a DNA sequence coding for the expression of aldehyde dehydrogenase was
inserted into the NcoI-XhoI site of pSE380 (Invitrogen, San Diego, CA) to give pPFS3. The
resulting plasmid contained aldA under the control of the trc promoter. This construct
allowed for high-level expression of the aldA gene from E. coli under regulation of the
trc promoter. Unless indicated otherwise, all molecular biology and plasmid
constructions were done in E. coli AG1 (Stratagene, La Jolla, CA).

The primers aldB_L (SEQ ID NO:20) and aldB_R (SEQ ID NO:21) were used
to amplify the 1574 bp aldB fragment from genomic E. coli DNA (strain MG1655).
The resulting PCR converted the TGA stop codon into a TAA stop codon. The
gel-purified PCR fragment containing the DNA sequence sufficiently coding for the

give pPFS12.

The primers ALD4_L (SEQ ID NO:16) and ALD4_R (SEQ ID NO:17) were
used to amplify the 1595 bp ALD4 fragment from S. cerevisiae DNA (strain YPH500).
The gel-purified fragment containing a DNA sequence coding for the expression of
aldehyde dehydrogenase was inserted into the KpnI-SacI site of pPFS3 to give pPFS8.
The resulting plasmid contained mature ALD4 under control of the trc promoter.

The primers ALDH2_L (SEQ ID NO:18) and ALDH2_R (SEQ ID NO:19)
were used to amplify the 1541 bp ALDH2 fragment from pT7-7::ALDH2, a gift from
H. Weiner (Purdue University, West Lafayette, IN). The gel-purified PCR fragment
containing a DNA sequence sufficiently homologous to base pairs 22 to 1524, inclusive,
of SEQ ID NO:3 so as to code for the expression of aldehyde dehydrogenase was
inserted into the KpnI-SacI site of pSE380 to give pPFS7. This sequence was moved
from pPFS7 into the KpnI-SacI site of pPFS3 to give pPFS9. The resulting plasmid
contained mature ALDH2 under the control of the trc promoter.

The primers pTRC_L (SEQ ID NO:22) and pTRC_R (SEQ ID NO:23) were
used to amplify the 540 bp fragment from pSE380. The gel-purified PCR fragment was
inserted into the HpaI-KpnI site of pPFS3 to give pPFS13. The resulting plasmid
deleted the "native" ribosome binding site of pSE380 and an NcoI site (which contained
an extraneous ATG start codon upstream of the cloned genes). The KpnI-SacI
fragments of pPFS8, pPFS9, and pPFS12 were inserted into the KpnI-SacI site of
pPFS13 to give pPFS14, pPFS15, and pPFS16, respectively.
Assay for production of 3-HP

The efficacy of changes made as contemplated herein can be checked by the
following tests. To test for the production of 3-HP, fermentation products can be
quantified with a Waters Alliance Integrity HPLC system (Milford, MA) equipped with
a refractive index detector, a photodiode array detector, and an Aminex HPX-87H
(Bio-Rad, Hercules, CA) organic acids column. The mobile phase should be 0.01 N sulfuric
acid solution (pH 2.0) at a flow rate of 0.5 mL/min. The column temperature should be
set to 40 °C. Compounds can be identified by determining if they co-elute with
authentic standards. Prior to analysis, all samples should be filtered through 0.45 µm
pore size membranes (Gelman Sciences, Ann Arbor, MI). The fractions of the
fermentation products collected using HPLC should be analyzed on a Varian Star 3400
CX gas chromatograph coupled to a Varian Saturn 3 mass spectrometer (GC-MS)
(Walnut Creek, CA).

Assay for enzyme activity. 

Aldehyde dehydrogenase activity can be determined by measuring the reduction
of β-NAD+ at 25 °C with 3-hydroxypropionaldehyde as a substrate. All buffers should
contain 1 mM ethylenediaminetetraacetic acid (EDTA), 0.1 mM Pefabloc SC
(Boehringer Mannheim, Indianapolis, IN) and 1 mM Tris(carboxyethyl)phosphine
hydrochloride (TCEP-HCl). For ALD4, the solution should contain 100 mM Tris-HCl
buffer (pH 8.0) and 100 mM KCl. For ALDH2, the solution should contain 100 mM
sodium pyrophosphate (pH 9.0). For AldA and AldB, the solution should contain 20
mM sodium glycine (pH 9.5). A total of 3.0 mL of buffer should be added to quartz
cuvettes and allowed to equilibrate to assay temperature. From 5 to 20 µL of cell extract
should be added and background activity recorded after the addition of β-NAD+ to a
final concentration of 0.67 mM. The reaction should be started by the addition of
substrate (either acetaldehyde, propionaldehyde, or 3-hydroxypropionaldehyde) to a
final concentration of 2 mM. Assay mixtures should be stirred with micro-stirrers
during the assays.

For aldehyde dehydrogenase activity assays, one unit is defined as the reduction
of 1.0 µmol of β-NAD+ per minute at 25 °C. These reactions can be monitored by
following the change in absorbance at 340 nm (A340) at 25 °C on a Varian Cary-1 Bio
spectrophotometer (Sugar Land, TX). Total protein concentrations in the cell extracts
can be determined using the Bradford assay method (Bio-Rad, Hercules, CA) with
bovine serum albumin as the standard.
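
As a worked illustration of how an A340 trace might be turned into the specific activity
(U/mg of protein) reported later, the sketch below applies the Beer-Lambert law. It
assumes the commonly used millimolar extinction coefficient of NADH at 340 nm
(6.22 mM^-1 cm^-1) and a 1 cm path length; neither value is stated in this specification,
and the function name and the numerical inputs are purely illustrative.

```python
# Assumed constant, not stated in the specification: millimolar extinction
# coefficient of NADH at 340 nm, in mM^-1 cm^-1.
NADH_EXTINCTION_MM = 6.22


def specific_activity(rate_a340_per_min, background_a340_per_min,
                      assay_volume_ml, extract_volume_ml,
                      protein_mg_per_ml, path_cm=1.0):
    """Return specific activity in U/mg, with 1 U = 1 µmol NAD+ reduced per minute."""
    # Subtract the background rate recorded before the substrate was added.
    net_rate = rate_a340_per_min - background_a340_per_min
    # Beer-Lambert: absorbance change per minute -> mM NADH formed per minute.
    mm_per_min = net_rate / (NADH_EXTINCTION_MM * path_cm)
    # mM is µmol/mL, so multiplying by the assay volume in mL gives µmol/min (U).
    units = mm_per_min * assay_volume_ml
    protein_mg = extract_volume_ml * protein_mg_per_ml
    return units / protein_mg


# Illustrative numbers only: 3.0 mL assay, 20 µL of extract at 5 mg protein/mL.
example = specific_activity(
    rate_a340_per_min=0.0700,
    background_a340_per_min=0.0078,
    assay_volume_ml=3.0,
    extract_volume_ml=0.020,
    protein_mg_per_ml=5.0,
)
print(f"{example:.2f} U/mg")
```

The small volume added by the extract, cofactor and substrate is neglected here; a more
careful calculation would use the final volume in the cuvette.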

EXAMPLES 

Plasmid constructions.

Klebsiella pneumoniae expresses glycerol dehydratase (dhaB), an enzyme that catalyzes
the conversion of glycerol to 3-hydroxypropionaldehyde, and 1,3-propanediol
oxidoreductase (the gene product of dhaT), an enzyme that catalyzes the conversion of
3-hydroxypropionaldehyde to 1,3-propanediol. A plasmid encoding these two genes was
created and expressed in E. coli (plasmid pTC53). The dhaT gene was deleted from pTC53
to create pMH34. The resulting plasmid still contained the DNA sequence complementary
to base pairs 330 to 2153, inclusive, of SEQ ID NO:9, the complement of base pairs 2166
to 2591, inclusive, of SEQ ID NO:9, and the complement of base pairs 3191 to 4858,
inclusive, of SEQ ID NO:9, so as to code for the expression of glycerol dehydratase. The
fragment of DNA encoding these sequences was excised from pMH34 by cutting it with
SalI-XbaI, and the resulting fragment was gel purified (the purified fragment was a gift
from M. Hoffman of the University of Wisconsin-Madison). This DNA fragment was
inserted into the SalI-XbaI site of pPFS13 to give pPFS17.

The resulting plasmid contained both the aldA and dhaB genes under the control
of the trc promoter. Similarly, the gel-purified SalI-XbaI fragment from pMH34 was
inserted into the SalI-XbaI sites of pPFS14, pPFS15, and pPFS16 to give pPFS18,
pPFS19, and pPFS20, respectively. These plasmids contained ALD4, ALDH2, and
aldB, respectively, as well as dhaB under the control of the trc promoter; in all of the
constructs the dhaB gene was downstream of the gene encoding the aldehyde
dehydrogenase.
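
Because the plasmids are built up in several steps, it can help to record the
construction history as a simple lookup table. The sketch below encodes only the
parent-and-insert relationships stated in the text (pPFS3 through pPFS20); the dictionary
layout and the backbone_lineage function are hypothetical conveniences. The parent of
pPFS12 is not recoverable from the text, so that plasmid is left out of the table.

```python
# plasmid -> (recipient plasmid, fragment inserted), as described in the text.
PLASMIDS = {
    "pPFS3": ("pSE380", "aldA PCR fragment (NcoI-XhoI)"),
    "pPFS7": ("pSE380", "ALDH2 PCR fragment (KpnI-SacI)"),
    "pPFS8": ("pPFS3", "ALD4 PCR fragment (KpnI-SacI)"),
    "pPFS9": ("pPFS3", "ALDH2 fragment from pPFS7 (KpnI-SacI)"),
    "pPFS13": ("pPFS3", "540 bp trc fragment from pSE380 (HpaI-KpnI)"),
    "pPFS14": ("pPFS13", "KpnI-SacI fragment of pPFS8"),
    "pPFS15": ("pPFS13", "KpnI-SacI fragment of pPFS9"),
    "pPFS16": ("pPFS13", "KpnI-SacI fragment of pPFS12"),
    "pPFS17": ("pPFS13", "SalI-XbaI dhaB fragment of pMH34"),
    "pPFS18": ("pPFS14", "SalI-XbaI dhaB fragment of pMH34"),
    "pPFS19": ("pPFS15", "SalI-XbaI dhaB fragment of pMH34"),
    "pPFS20": ("pPFS16", "SalI-XbaI dhaB fragment of pMH34"),
}

# Genes carried by the four final production constructs, per the text.
FINAL_CONSTRUCTS = {
    "pPFS17": ("aldA", "dhaB"),
    "pPFS18": ("ALD4", "dhaB"),
    "pPFS19": ("ALDH2", "dhaB"),
    "pPFS20": ("aldB", "dhaB"),
}


def backbone_lineage(name):
    """Follow recipient plasmids back to the starting vector."""
    chain = [name]
    while chain[-1] in PLASMIDS:
        chain.append(PLASMIDS[chain[-1]][0])
    return chain


print(" <- ".join(backbone_lineage("pPFS18")))
```

Tracing a final construct this way makes it easy to see that all four production
plasmids share the pPFS13 backbone derived from pSE380.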
Expression in E. coli.

The efficacy of E. coli as a platform for the production of 3-HP from growth on
glucose has been examined using a mathematical model developed for this purpose. The
model was executed in two different ways, assuming the conversion of one mole of
glucose under either anaerobic or aerobic conditions either directly to 3-HP or to the
production of 3-HP and ATP. The optimum yield under anaerobic conditions is 1 mole
of 3-HP and 1 mole of lactate. The more realistic yield under anaerobic conditions is
0.5 moles of 3-HP, 1.5 moles of lactate and 1 mole of ATP. The optimum yield under
aerobic conditions is 1.9 moles of 3-HP and 0.3 moles of CO2. The realistic yield under
aerobic conditions is 1.85 moles of 3-HP, 0.35 moles of CO2 and 1 mole of ATP.
The effect of 3-HP concentration on the growth of E. coli strain MG1655 was
measured. Cells were grown on standard media with and without the addition of up to
80 g/L of 3-HP. The best fit of these data demonstrated that 3-HP was only 1.4 times as
inhibitory as lactic acid on the growth of E. coli. Since it is possible to economically
produce lactic acid using E. coli, and 3-HP is only 1.4 times more inhibitory than lactic
acid, it should be possible to use E. coli as a host for the commercial production of 3-HP.
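
The molar yields quoted for the model above can be restated as mass yields and checked
for carbon recovery with a few lines of arithmetic. The sketch below assumes molecular
weights of 180.16 g/mol for glucose and 90.08 g/mol for 3-HP (values from the molecular
formulas C6H12O6 and C3H6O3, not from this specification); the scenario labels and
function names are illustrative, and the model itself is not reproduced here.

```python
# Assumed molecular weights in g/mol (from C6H12O6 and C3H6O3).
MW = {"glucose": 180.16, "3-HP": 90.08}
# Carbon atoms per molecule of each product named in the text.
CARBONS = {"3-HP": 3, "lactate": 3, "CO2": 1}

# Moles of product per mole of glucose, as quoted from the model.
SCENARIOS = {
    "anaerobic, optimum": {"3-HP": 1.0, "lactate": 1.0},
    "anaerobic, realistic": {"3-HP": 0.5, "lactate": 1.5},
    "aerobic, optimum": {"3-HP": 1.9, "CO2": 0.3},
    "aerobic, realistic": {"3-HP": 1.85, "CO2": 0.35},
}


def mass_yield(products):
    """Grams of 3-HP formed per gram of glucose consumed."""
    return products["3-HP"] * MW["3-HP"] / MW["glucose"]


def carbon_recovered(products):
    """Moles of carbon found in the listed products per mole of glucose (6 C)."""
    return sum(CARBONS[name] * moles for name, moles in products.items())


for label, products in SCENARIOS.items():
    print(f"{label}: {mass_yield(products):.3f} g 3-HP per g glucose, "
          f"{carbon_recovered(products):.2f} of 6 mol C in listed products")
```

Any carbon not accounted for by the listed products would have to appear elsewhere in
the model, for example in biomass; the text does not say where.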

Media and growth conditions 

The standard media contained the following per liter: 6 g Na2HPO4, 3 g KH2PO4,
1 g NH4Cl, 0.5 g NaCl, 3 mg CaCl2, 5 g yeast extract (Difco Laboratories, Detroit, MI)
and 2 mM MgSO4. When necessary to retain plasmids, ampicillin (100 mg/mL) was
added to the media. Isopropyl-β-thiogalactopyranoside (IPTG) was added in varying
amounts to induce gene expression. All fermentations were carried out in an
incubator-shaker at 37°C and 200 rpm. Anaerobic fermentations were carried out in 500-mL
anaerobic flasks with 300 mL of working volume. Inocula for fermentations were
grown overnight in Luria-Bertani medium supplemented with ampicillin if necessary.
The 300-mL fermentations were inoculated with 1.5 mL of the overnight culture. For
enzyme assays, fermentations were incubated for 24 hours.
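
The per-liter recipe above can be scaled to the 300-mL working volume, and the inoculum expressed as a volume fraction, as in the following sketch. It is an illustration only: the function names are hypothetical, and ampicillin and IPTG are omitted because their amounts are stated as "when necessary" or "varying".

```python
# Per-liter amounts of the solid components, in grams, as given in the text.
RECIPE_G_PER_L = {
    "Na2HPO4": 6.0,
    "KH2PO4": 3.0,
    "NH4Cl": 1.0,
    "NaCl": 0.5,
    "CaCl2": 0.003,       # 3 mg
    "yeast extract": 5.0,
}
MGSO4_MM = 2.0  # MgSO4 is specified as a concentration (mM), not a mass


def scale_recipe(volume_ml):
    """Return (grams of each solid, mmol of MgSO4) for the given volume."""
    liters = volume_ml / 1000.0
    grams = {name: amount * liters for name, amount in RECIPE_G_PER_L.items()}
    mgso4_mmol = MGSO4_MM * liters
    return grams, mgso4_mmol


def inoculum_percent(inoculum_ml, culture_ml):
    """Inoculum size as percent (v/v) of the culture volume."""
    return 100.0 * inoculum_ml / culture_ml


grams, mgso4_mmol = scale_recipe(300)
for name, amount in grams.items():
    print(f"{name}: {amount:.4f} g")
print(f"MgSO4: {mgso4_mmol:.2f} mmol")
print(f"inoculum: {inoculum_percent(1.5, 300):.2f}% (v/v)")
```

For the 300-mL fermentations described, the 1.5-mL inoculum corresponds to 0.5% (v/v).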

Overexpression of aldehyde dehydrogenase in E. coli.

Cells were harvested by centrifugation at 3000 x g for 10 minutes at 4°C with a
Beckman (Fullerton, CA) model J2-21 centrifuge. Cell pellets were washed twice in
100 mM potassium phosphate buffer at pH 7.2 and resuspended in the appropriate assay
resuspension buffer at a volume equal to 5 times the volume of the wet cell mass. The
cells were homogenized using a French pressure cell. The homogenate was centrifuged
at 40000 x g for 30 minutes. The supernatant was dialyzed against the appropriate
resuspension buffer using 10000 molecular weight cut-off pleated dialysis tubing
(Pierce, Rockford, IL) at 4°C. Dialysis buffer was changed after 2 hours and again after
4 hours, and dialysis was stopped after being allowed to proceed overnight.

E. coli AG1 cells transfected with the plasmids constructed to express the aldA,
ALD4, ALDH2, or aldB genes were grown in 500-mL anaerobic flasks. Twelve hours
after the fermentations were inoculated, IPTG was added to induce enzyme expression.
The cells were allowed to grow for an additional 12 hours, then harvested and lysed as
discussed above. The soluble fraction of the lysate was assayed for aldehyde
dehydrogenase activity using the substrate 3-hydroxypropionaldehyde in the buffer
appropriate for the particular enzyme expressed. The plasmid, aldehyde dehydrogenase
expressed and specific activity measured (U/mg of protein) were as follows: pPFS13,
aldA, 0.2; pPFS14, ALD4, 0.5; pPFS15, ALDH2, 0.3; and pPFS16, aldB, 0.1. The
control, E. coli strain AG1 harboring plasmid pSE380, encoded no exogenous aldehyde
dehydrogenase activity and it had no detectable activity with 3-HP as substrate. It is
clear from the activity assays that all four aldehyde dehydrogenases were expressed in
E. coli. The aldehyde dehydrogenase cloned from Saccharomyces cerevisiae (ALD4)
had the highest activity when 3-hydroxypropionaldehyde was used as the substrate (0.5
units/mg of protein).

E. coli cells transformed with plasmids expressing aldehyde dehydrogenase,
both aldehyde dehydrogenase and glycerol dehydratase, or neither gene were grown
and assayed for their ability to produce 3-HP from glycerol. The cells were grown on
standard media supplemented with 6 μM of Coenzyme B12, under anaerobic conditions
in the absence of light (to protect the integrity of the Coenzyme B12 necessary for DhaB
activity). After 12 hours, IPTG was added to induce expression of the genes under the
trc promoter; at the same time, 5 g/L of glycerol was added. After 12 more hours of
anaerobic fermentation the fermentation broth was assayed for 3-HP by HPLC and GC.
The plasmid, aldehyde dehydrogenase gene expressed and g/L of 3-HP measured were
as follows: pPFS17, aldA, 0.031; pPFS18, ALD4, 0.173; and pPFS19, ALDH2, 0.061.
Cells expressing dhaB but no exogenous aldehyde dehydrogenase genes (plasmid
pMH34) produced 0.015 g/L of 3-HP. Cells expressing aldA, ALD4, ALDH2 or aldB
but not dhaB (plasmids pPFS13, pPFS14, pPFS15, pPFS16, respectively) all produced
less than 0.005 g/L of 3-HP when the media the cells were growing in was
supplemented with 2.5 g/L of 3-hydroxypropionaldehyde.
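
The 3-HP titers reported above can be compared with the dhaB-only control and related to the glycerol supplied. The sketch below is illustrative only: the molar masses are standard values supplied as assumptions, the function names are hypothetical, and the molar yield is computed on glycerol added (5 g/L) rather than glycerol consumed, which the text does not report.

```python
# Standard molar masses in g/mol (assumptions; not stated in the text).
MW_3HP = 90.08
MW_GLYCEROL = 92.09

GLYCEROL_ADDED_G_PER_L = 5.0   # glycerol added at induction
CONTROL_TITER_G_PER_L = 0.015  # dhaB only, no exogenous aldehyde dehydrogenase

# Aldehyde dehydrogenase gene co-expressed with dhaB -> g/L of 3-HP measured.
TITERS_G_PER_L = {"aldA": 0.031, "ALD4": 0.173, "ALDH2": 0.061}


def fold_over_control(titer):
    """Titer relative to the dhaB-only control."""
    return titer / CONTROL_TITER_G_PER_L


def molar_yield_on_glycerol_added(titer):
    """Moles of 3-HP formed per mole of glycerol added to the medium."""
    return (titer / MW_3HP) / (GLYCEROL_ADDED_G_PER_L / MW_GLYCEROL)


best_gene = max(TITERS_G_PER_L, key=TITERS_G_PER_L.get)
for gene, titer in TITERS_G_PER_L.items():
    print(f"{gene}: {fold_over_control(titer):.1f}-fold over control, "
          f"{100 * molar_yield_on_glycerol_added(titer):.2f}% of glycerol added")
print("highest titer:", best_gene)
```

The ranking by titer (ALD4, then ALDH2, then aldA) follows the ranking of the specific activities measured in the lysate assays.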

Other Hosts and Promoters

Applications of the 3-hydroxypropionic acid pathway such as the genetic
constructs of the present invention can easily be expressed in other organisms. The
required genes would need to be placed under control of an appropriate promoter or
promoters. Some organisms such as yeasts may require transcription terminators to be
placed after each transcribed unit. The knowledge of the present invention makes such
amendments possible. Such a genetic construct would need to be part of a vector that
could either replicate in the new host or integrate into the chromosome of the new host.
Many such vectors are commercially available for expression in gram-negative and
gram-positive bacteria, yeast, mammalian cells, insect cells, plants, etc. For example, to
express the 3-hydroxypropionic acid pathway in Rhodobacter capsulatus, one could
obtain vector pNH2 from the American Type Culture Collection (ATCC). This is a
shuttle vector for use in R. capsulatus and E. coli. Organisms such as Saccharomyces
cerevisiae which can convert glucose to glycerol could be used as a host; such a
construct would enable the production of 3-HP directly from glucose. Additionally,
other substrates such as xylan could also be used given the selection of an appropriate
host.

Stoichiometric analysis shows that the best stoichiometric yield of 3-HP
production in E. coli, calculated on the basis of glucose consumed, is obtained under
aerobic conditions. Under aerobic conditions CO2 is the only carbon-containing
co-product; in particular, the generation of lactic acid, which occurs under anaerobic
conditions, is avoided. Production of 3-HP under these conditions could result in a
more economical recovery of 3-HP from the fermentation broth.

Alternatively, the dhaB gene and a gene encoding the appropriate aldehyde
dehydrogenase could be cloned into the multiple cloning site of this vector in E. coli to
facilitate construction, and then transformed into R. capsulatus. The R. capsulatus nifH
promoter, provided on the plasmid, could be used to direct the transcription in
R. capsulatus of the genes placed into pNF2 in series with one promoter, or with two
copies of the nifH promoter. Expression of the genes in other organisms would require
a procedure analogous to that presented here.

Alternative Aldehyde Dehydrogenases and Glycerol Dehydratases

Applications of the pathway for the production of 3-hydroxypropionic acid from
glycerol can be made using other suitable aldehyde dehydrogenases. To be functional in
this pathway an aldehyde dehydrogenase needs to be stable, readily expressed in the
host of choice and have high enough activity towards 3-hydroxypropionaldehyde to
enable it to make 3-HP. The knowledge of the present invention makes such
amendments possible. A program of directed evolution could be undertaken to select
for suitable aldehyde dehydrogenases, or they could be recovered from native sources.
The genes encoding these enzymes, in conjunction with a gene encoding an appropriate
glycerol dehydratase activity, would then be made part of any of the constructs
envisioned here to produce 3-hydroxypropionic acid from glycerol.

A similar program of enzyme improvement, including for example directed
evolution, could be carried out using the dhaB gene from Klebsiella pneumoniae as a
starting point to obtain other variants of glycerol dehydratase that are superior in
efficiency and stability to the form used in this invention. Alternatively, enzymes which
catalyze the same reaction may be isolated from other organisms and used in place of
the Klebsiella pneumoniae glycerol dehydratase. Such enzymes may be especially
useful in alternative hosts wherein they may be more readily expressed, be more stable
and more efficient under the fermentation conditions best suited to the growth of the
construct and the production and recovery of 3-HP.

CLAIM OR CLAIMS

I/WE CLAIM: 

1. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which expresses genes
for non-native enzymes which are capable of catalyzing the production of
3-hydroxypropionic acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism,
and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.



2. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which carries genetic
constructions for the expression of a glycerol dehydratase and an aldehyde
dehydrogenase which are capable of catalyzing the production of 3-hydroxypropionic
acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism,
and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.

3. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which carries a genetic
construct which expresses the dhaB gene from Klebsiella pneumoniae and a gene for an
aldehyde dehydrogenase, which are capable of catalyzing the production of
3-hydroxypropionic acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism,
and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.

4. The method of claim 3 wherein the gene for the aldehyde dehydrogenase is
selected from the group consisting of ALDH4, ALD2, aldA and aldB.

5. The method of claim 3 wherein the aldehyde dehydrogenase is selected from
the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8.

6. A recombinant E. coli host comprising in its inheritable genetic materials
foreign genes encoding a glycerol dehydratase and an aldehyde dehydrogenase, such
that the host is capable of producing 3-hydroxypropionic acid from glycerol.

7. A recombinant E. coli host comprising in its inheritable genetic materials the
dhaB gene from Klebsiella pneumoniae and the ald4 gene from Saccharomyces
cerevisiae, such that the host is capable of producing 3-hydroxypropionic acid from
glycerol.

8. A bacterial host comprising in its inheritable genetic material a genetic
construction encoding for the expression of a glycerol dehydratase enzyme and an
aldehyde dehydrogenase enzyme, such that the bacterial host is capable of converting
glycerol to 3-hydroxypropionic acid.

9. The bacterial host of claim 8 wherein the glycerol dehydratase is from
Klebsiella pneumoniae.

10. The bacterial host of claim 8 wherein the gene encoding the glycerol
dehydratase is the dhaB gene from Klebsiella pneumoniae.

11. The bacterial host of claim 8 wherein the aldehyde dehydrogenase is
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6 and
SEQ ID NO:8.

12. The bacterial host of claim 8 wherein the gene for the aldehyde
dehydrogenase is selected from the group consisting of ALDH4, ALD2, aldA and aldB.



WO 01/16346    PCT/US00/23878



SEQUENCE LISTING

<110> Suthers, Patrick F.
Cameron, Douglas C.

<120> Production of 3-Hydroxypropionic Acid in Recombinant Organisms

<130> UW960296.96617

<140>
<141>

<160> 23

<170> PatentIn Ver. 2.1

<210> 1

<211> 1529
<212> DNA

<213> Saccharomyces cerevisiae

<220>
<221> CDS

<222> (25)..(1509)

<400> 1

gtcgcggtac caaggaggta tcat atg tca cac ctt cct atg aca gtg cct 51
Met Ser His Leu Pro Met Thr Val Pro
1 5

atc aag ctg ccc aat ggg ttg gaa tat gag caa cca acg ggg ttg ttc 99
Ile Lys Leu Pro Asn Gly Leu Glu Tyr Glu Gln Pro Thr Gly Leu Phe
10 15 20 25

atc aac aac aag ttt gtt cct tct aaa cag aac aag acc ttc gaa gtc 147
Ile Asn Asn Lys Phe Val Pro Ser Lys Gln Asn Lys Thr Phe Glu Val
30 35 40

att aac cct tcc acg gaa gaa gaa ata tgt cat att tat gaa ggt aga 195
Ile Asn Pro Ser Thr Glu Glu Glu Ile Cys His Ile Tyr Glu Gly Arg
45 50 55



gag gac gat gtg gaa gag gcc gtg cag gcc gee gac cgt gcc ttc tct 243 
S Glu Asp Asp Val Glu Glu Ala Val Gin Ala Ala Asp Arg Ala Phe Ser 
60 65 70 



aat ggg tct tgg aac ggt atc gac cct att gac agg ggt aag gct ttg 291
Asn Gly Ser Trp Asn Gly Ile Asp Pro Ile Asp Arg Gly Lys Ala Leu
75 80 85

10 tac agg tta gcc gaa tta att gaa cag gac aag gat gtc att get tec 339 
Tyr Arg Leu Ala Glu Leu He Glu Gin Asp Lys Asp Val He Ala Ser 
90 95 100 105 

ate gag act ttg gat aac ggt aaa get ate tct tee teg aga gga gat 387 
He Glu Thr Leu Asp Asn Gly Lys Ala He Ser Ser Ser Arg Gly Asp 
IS 110 115 120 

gtt gat tta gtc ate aac tat ttg aaa tct tct get ggc ttt get gat 435 
Val Asp Leu Val He Asn Tyr Leu Lys Ser Ser Ala Gly Phe Ala Asp 
125 130 135 



aaa att gat ggt aga atg att gat act ggt aga ace cat ttt tct tac 483 
20 Lys He Asp Gly Arg Met He Asp Thr Gly Arg Thr His Phe Ser Tyr 
140 145 150 

act aag aga cag cct ttg ggt gtt tgt ggg cag att att cct tgg aat 531 
Thr Lys Arg Gin Pro Leu Gly Val Cys Gly Gin He He Pro Trp Asn 
155 160 165 



ttc cca ctg ttg atg tgg gcc tgg aag att gcc cct gct ttg gtc acc 579
Phe Pro Leu Leu Met Trp Ala Trp Lys Ile Ala Pro Ala Leu Val Thr
170 175 180 185

ggt aac acc gtc gtg ttg aag act gcc gaa tcc acc cca ttg tcc gct 627
Gly Asn Thr Val Val Leu Lys Thr Ala Glu Ser Thr Pro Leu Ser Ala
190 195 200 

ttg tat gtg tot aaa tac ate cca cag gcg ggt att cca cet ggt gtg 675 
5 Leu Tyr Val Ser Lys Tyr lie Pro Gin Ala Gly lie Pro Pro Gly Val 
205 210 215 

ate aac att gta tec ggg ttt ggt aag att gtg gtt gag gcc att aca 723 
lie Asn He Val Ser Gly Phe Gly Lys He Val Val Glu Ala He Thr 
220 225 230 

10 aac cat cca aaa ate aaa aag gtt gee ttc aca ggg tec aeg get aeg 771 
Asn His Pro Lys He Lys Lys Val Ala Phe Thr Gly Ser Thr Ala Thr 
235 240 245 

ggt aga cac att tac cag tec gca gcc gea ggc ttg aaa aaa gtg act 819 
Gly Arg His He Tyr Gin Ser Ala Ala Ala Gly Leu Lys Lys Val Thr 
IS 250 255 260 265 

ttg gag ctg ggt ggt aaa tea cca aac att gtc ttc gcg gac gcc gag 867 
Leu Glu Leu Gly Gly Lys Ser Pro Asn He Val Phe Ala Asp Ala Glu 
270 275 280 

ttg aaa aaa gcc gtg caa aac att ate ctt ggt ate tac tac aat tct 915 
20 Leu Lys Lys Ala Val Gin Asn He He Leu Gly He Tyr Tyr Asn Ser 
285 290 295 

ggt gag gtc tgt tgt gcg ggt tea agg gtg tat gtt gaa gaa tct att 963 
Gly Glu Val Cys Cys Ala Gly Ser Arg Val Tyr Val Glu Glu Ser He 
300 305 310 

tac gac aaa ttc att gaa gag ttc aaa gcc gct tct gaa tcc atc aag 1011
Tyr Asp Lys Phe Ile Glu Glu Phe Lys Ala Ala Ser Glu Ser Ile Lys
315 320 325

gtg ggc gac cca ttc gat gaa tct act ttc caa ggt gca caa acc tct 1059
Val Gly Asp Pro Phe Asp Glu Ser Thr Phe Gln Gly Ala Gln Thr Ser
330 335 340 345 

caa atg caa eta aac aaa ate ttg aaa tac gtt gac att ggt aag aat 1107 
S Gin Met Gin Leu Asn Lys lie Leu Lys Tyr Val Asp He Gly Lys Asn 
350 355 360 

gaa ggt get act ttg att acc ggt ggt gaa aga tta ggt age aag ggt 1155 
Glu Gly Ala Thr Leu He Thr Gly Gly Glu Arg Leu Gly Ser Lys Gly 
365 370 375 

10 tac ttc att aag cca act gtc ttt ggt gac gtt aag gaa gac atg aga 1203 
Tyr Phe He Lys Pro Thr Val Phe Gly Asp Val Lys Glu Asp Met Arg 
380 385 390 

att gtc aaa gag gaa ate ttt ggc cet gtt gtc act gta ace aaa ttc 1251 
He Val Lys Glu Glu He Phe Gly Pro Val Val Thr Val Thr Lys Phe 
IS 395 400 405 

aaa tct gcc gac gaa gtc att aac atg gcg aac gat tct gaa tac ggg 1299 
Lys Ser Ala Asp Glu Val He Asn Met Ala Asn Asp Ser Glu Tyr Gly 
410 415 420 425 

ttg get get ggt att cac acc tct aat att aat acc gcc tta aaa gtg 1347 
20 Leu Ala Ala Gly He His Thr Ser Asn He Asn Thr Ala Leu Lys Val 
430 435 440 

get gat aga gtt aat gcg ggt acg gtc tgg ata aac act tat aac gat 1395 
Ala Asp Arg Val Asn Ala Gly Thr Val Trp He Asn Thr Tyr Asn Asp 
445 450 455 

ttc cac cac gca gtt cct ttc ggt ggg ttc aat gca tct ggt ttg ggc 1443
Phe His His Ala Val Pro Phe Gly Gly Phe Asn Ala Ser Gly Leu Gly
460 465 470

agg gaa atg tct gtt gat gct tta caa aac tac ttg caa gtt aaa gcg 1491
Arg Glu Met Ser Val Asp Ala Leu Gln Asn Tyr Leu Gln Val Lys Ala
475 480 485 

gtc cgt gcc aaa ttg gac gagtaagagc tcgaattcgc 1529 
5 Val Arg Ala Lys Leu Asp 
490 495 



<210> 2 
<211> 495 
<212> PRT 
<213> Saccharomyces cerevisiae

<400> 2

Met Ser His Leu Pro Met Thr Val Pro Ile Lys Leu Pro Asn Gly Leu
1 5 10 15

Glu Tyr Glu Gin Pro Thr Gly Leu Phe He Asn Asn Lys Phe Val Pro 
15 20 25 30 

Ser Lys Gin Asn Lys Thr Phe Glu Val He Asn Pro Ser Thr Glu Glu 
35 40 45 

Glu He Cys His He Tyr Glu Gly Arg Glu Asp Asp Val Glu Glu Ala 
50 55 60 

20 Val Gin Ala Ala Asp Arg Ala Phe Ser Asn Gly Ser Trp Asn Gly He 
65 70 75 80 

Asp Pro He Asp Arg Gly Lys Ala Leu Tyr Arg Leu Ala Glu Leu He 
85 90 95 

Glu Gin Asp Lys Asp Val He Ala Ser He Glu Thr Leu Asp Asn Gly 
25 100 105 110 



Lys Ala He Ser Ser Ser Arg Gly Asp Val Asp Leu Val He Asn Tyr 
115 120 125 







Leu Lys Ser Ser Ala Gly Phe Ala Asp Lys He Asp Gly Arg Met He 
130 135 140 

Asp Thr Gly Arg Thr His Phe Ser Tyr Thr Lys Arg Gin Pro Leu Gly 
145 150 155 160 

5 Val Cys Gly Gin He He Pro Trp Asn Phe Pro Leu Leu Met Trp Ala 
165 170 175 

Trp Lys Ile Ala Pro Ala Leu Val Thr Gly Asn Thr Val Val Leu Lys
180 185 190 

Thr Ala Glu Ser Thr Pro Leu Ser Ala Leu Tyr Val Ser Lys Tyr He 
10 195 200 205 

Pro Gin Ala Gly He Pro Pro Gly Val He Asn He Val Ser Gly Phe 
210 215 220 

Gly Lys He Val Val Glu Ala He Thr Asn His Pro Lys He Lys Lys 
225 230 235 240 

IS Val Ala Phe Thr Gly Ser Thr Ala Thr Gly Arg His He Tyr Gin Ser 
245 250 255 

Ala Ala Ala Gly Leu Lys Lys Val Thr Leu Glu Leu Gly Gly Lys Ser 
260 265 270 

Pro Asn He Val Phe Ala Asp Ala Glu Leu Lys Lys Ala Val Gin Asn 
20 275 280 285 

He He Leu Gly He Tyr Tyr Asn Ser Gly Glu Val Cys Cys Ala Gly 
290 295 300 

Ser Arg Val Tyr Val Glu Glu Ser He Tyr Asp Lys Phe He Glu Glu 
305 310 315 320 

Phe Lys Ala Ala Ser Glu Ser Ile Lys Val Gly Asp Pro Phe Asp Glu
325 330 335

Ser Thr Phe Gln Gly Ala Gln Thr Ser Gln Met Gln Leu Asn Lys Ile
340 345 350 

Leu Lys Tyr Val Asp lie Gly Lys Asn Glu Gly Ala Thr Leu lie Thr 
355 360 365 

5 Gly Gly Glu Arg Leu Gly Ser Lys Gly Tyr Phe He Lys Pro Thr Val 
370 375 380 

Phe Gly Asp Val Lys Glu Asp Met Arg He Val Lys Glu Glu He Phe 
385 390 395 400 

Gly Pro Val Val Thr Val Thr Lys Phe Lys Ser Ala Asp Glu Val He 
10 405 410 415 

Asn Met Ala Asn Asp Ser Glu Tyr Gly Leu Ala Ala Gly He His Thr 
420 425 430 

Ser Asn He Asn Thr Ala Leu Lys Val Ala Asp Arg Val Asn Ala Gly 
435 440 445 

15 Thr Val Trp He Asn Thr Tyr Asn Asp Phe His His Ala Val Pro Phe 
450 455 460 

Gly Gly Phe Asn Ala Ser Gly Leu Gly Arg Glu Met Ser Val Asp Ala 
465 470 475 480 

Leu Gin Asn Tyr Leu Gin Val Lys Ala Val Arg Ala Lys Leu Asp 
485 490 495



<210> 3

<211> 1541

<212> DNA

<213> Homo sapiens

<220>

<221> CDS

<222> (22)..(1521)

<400> 3

gcggtaccaa ggagatatca t atg tca gcc gcc gcc acc cag gcc gtg cct 51
Met Ser Ala Ala Ala Thr Gln Ala Val Pro
1 5 10

gcc ccc aac cag cag ccc gag gtc ttc tgc aac cag att ttc ata aac 99 
Ala Pro Asn Gin Gin Pro Glu Val Phe Cys Asn Gin lie Phe lie Asn 
10 15 20 25 

aat gaa tgg cac gat gcc gtc age agg aaa aca ttc ccc acc gtc aat 147 
Asn Glu Trp His Asp Ala Val Ser Arg Lys Thr Phe Pro Thr Val Asn 
30 35 40 

ccg tec act gga gag gtc ate tgt cag gta get gaa ggg gac aag gaa 195 
IS Pro Ser Thr Gly Glu Val lie Cys Gin Val Ala Glu Gly Asp Lys Glu 
45 50 55 

gat gtg gac aag gea cgt gaa ggc cgc ccg ggc gcc ttc cag ctg ggc 243 
Asp Val Asp Lys Ala Arg Glu Gly Arg Pro Gly Ala Phe Gin Leu Gly 
60 65 70 

20 tea cct tgg cgc cgc atg gac gea tea cac age ggc egg ctg ctg aac 291 
Ser Pro Trp Arg Arg Met Asp Ala Ser His Ser Gly Arg Leu Leu Asn 
75 80 85 90 

cgc ctg gcc gat ctg ate gag egg gac egg acc tac ctg gcg gee ttg 339 
Arg Leu Ala Asp Leu lie Glu Arg Asp Arg Thr Tyr Leu Ala Ala Leu 
25 95 100 105 

gag acc ctg gac aat ggc aag ccc tat gtc atc tcc tac ctg gtg gat 387
Glu Thr Leu Asp Asn Gly Lys Pro Tyr Val Ile Ser Tyr Leu Val Asp
110 115 120

ttg gac atg gtc ctc aaa tgt ctc cgg tat tat gcc ggc tgg gct gat 435
Leu Asp Met Val Leu Lys Cys Leu Arg Tyr Tyr Ala Gly Trp Ala Asp
125 130 135 

aag tac cac ggg aaa acc ate ccc att gac gga gac ttc ttc age tac 483 
5 Lys Tyr His Gly Lys Thr lie Pro lie Asp Gly Asp Phe Phe Ser Tyr 
140 145 150 

aca egc eat gaa cct gtg ggg gtg tge ggg cag ate att ccg tgg aat 531 
Thr Arg His Glu Pro Val Gly Val Cys Gly Gin lie He Pro Trp Asn 
155 160 165 170 

10 ttc ccg etc etg atg caa gea tgg aag ctg ggc cea gee ttg gea act 579 
Phe Pro Leu Leu Met Gin Ala Trp Lys Leu Gly Pro Ala Leu Ala Thr 
175 180 185 

gga aac gtg gtt gtg atg aag gta get gag cag aca ccc etc acc gee 627 
Gly Asn Val Val Val Met Lys Val Ala Glu Gin Thr Pro Leu Thr Ala 
15 190 195 200 

etc tat gtg gee aac ctg ate aag gag get ggc ttt ccc cct ggt gtg 675 
Leu Tyr Val Ala Asn Leu He Lys Glu Ala Gly Phe Pro Pro Gly Val 
205 210 215 

gtc aac att gtg cct gga ttt ggc ccc aeg get ggg gee gee att gcc 723 
20 Val Asn He Val Pro Gly Phe Gly Pro Thr Ala Gly Ala Ala He Ala 
220 225 230 

tec cat gag gat gtg gac aaa gtg gea ttc aca ggc tec act gag att 771 
Ser His Glu Asp Val Asp Lys Val Ala Phe Thr Gly Ser Thr Glu He 
235 240 245 250 



ggc cgc gta atc cag gtt gct gct ggg agc agc aac ctc aag aga gtg 819
Gly Arg Val Ile Gln Val Ala Ala Gly Ser Ser Asn Leu Lys Arg Val
255 260 265

acc ttg gag ctg ggg ggg aag agc ccc aac atc atc atg tca gat gcc 867
Thr Leu Glu Leu Gly Gly Lys Ser Pro Asn Ile Ile Met Ser Asp Ala
270 275 280 



gat atg gat tgg gcc gtg gaa cag gcc cac ttc gcc ctg ttc ttc aac 915 
S Asp Met Asp Trp Ala Val Glu Gin Ala His Phe Ala Leu Phe Phe Asn 
285 290 295 

cag ggc cag tgc tgc tgt gcc ggc tec egg ace ttc gtg eag gag gac 963 
Gin Gly Gin Cys Cys Cys Ala Gly Ser Arg Thr Phe Val Gin Glu Asp 
300 305 310 

10 ate tat gat gag ttt gtg gtg egg age gtt gee egg gee aag tet egg 1011 
He Tyr Asp Glu Phe Val Val Arg Ser Val Ala Arg Ala Lys Ser Arg 
315 320 325 330 

gtg gtc ggg aac ccc ttt gat age aag acc gag cag ggg ccg cag gtg 1059 
Val Val Gly Asn Pro Phe Asp Ser Lys Thr Glu Gin Gly Pro Gin Val 
IS 335 340 345 

gat gaa act eag ttt aag aag ate etc ggc tae ate aae aeg ggg aag 1107 
Asp Glu Thr Gin Phe Lys Lys He Leu Gly Tyr He Asn Thr Gly Lys 
350 355 360 

caa gag ggg gcg aag ctg ctg tgt ggt ggg ggc att get get gac egt 1155 
20 Gin Glu Gly Ala Lys Leu Leu Cys Gly Gly Gly He Ala Ala Asp Arg 
365 370 375 

ggt tac ttc ate cag ccc act gtg ttt gga gat gtg eag gat ggc atg 1203 
Gly Tyr Phe He Gin Pro Thr Val Phe Gly Asp Val Gin Asp Gly Met 
380 385 390 

acc atc gcc aag gag gag atc ttc ggg cca gtg atg cag atc ctg aag 1251
Thr Ile Ala Lys Glu Glu Ile Phe Gly Pro Val Met Gln Ile Leu Lys
395 400 405 410

ttc aag acc ata gag gag gtt gtt ggg aga gcc aac aat tcc acg tac 1299
Phe Lys Thr Ile Glu Glu Val Val Gly Arg Ala Asn Asn Ser Thr Tyr
415 420 425 



ggg ctg gcc gca get gtc ttc aca aag gat ttg gac aag gcc aat tac 1347 
5 Gly Leu Ala Ala Ala Val Phe Thr Lys Asp Leu Asp Lys Ala Asn Tyr 
430 435 440 

ctg tec cag gcc etc cag gcg ggc act gtg tgg gtc aac tgc tat gat 1395 
Leu Ser Gin Ala Leu Gin Ala Gly Thr Val Trp Val Asn Cys Tyr Asp 
445 450 455 

10 gtg ttt gga gcc cag tea ccc ttt ggt ggc tac aag atg teg ggg agt 1443 
Val Phe Gly Ala Gin Ser Pro Phe Gly Gly Tyr Lys Met Ser Gly Ser 
460 465 470 

ggc egg gag ttg ggc gag tac ggg ctg cag gca tac act gaa gtg aaa 1491 
Gly Arg Glu Leu Gly Glu Tyr Gly Leu Gin Ala Tyr Thr Glu Val Lys 
IS 475 480 485 490 

act gtc aca gtc aaa gtg cet cag aag aac tcataagage tegaattcgc 1541 
Thr Val Thr Val Lys Val Pro Gin Lys Asn 
495 500 



<210> 4 

<211> 500

<212> PRT

<213> Homo sapiens

<400> 4

Met Ser Ala Ala Ala Thr Gln Ala Val Pro Ala Pro Asn Gln Gln Pro
1 5 10 15



Glu Val Phe Cys Asn Gln Ile Phe Ile Asn Asn Glu Trp His Asp Ala
20 25 30

Val Ser Arg Lys Thr Phe Pro Thr Val Asn Pro Ser Thr Gly Glu Val
35 40 45 

lie Cys Gin Val Ala Glu Gly Asp Lys Glu Asp Val Asp Lys Ala Arg 
50 55 60 

5 Glu Gly Arg Pro Gly Ala Phe Gin Leu Gly Ser Pro Trp Arg Arg Met 
65 70 75 80 

Asp Ala Ser His Ser Gly Arg Leu Leu Asn Arg Leu Ala Asp Leu Ile
85 90 95 

Glu Arg Asp Arg Thr Tyr Leu Ala Ala Leu Glu Thr Leu Asp Asn Gly 
10 100 105 110 

Lys Pro Tyr Val lie Ser Tyr Leu Val Asp Leu Asp Met Val Leu Lys 
115 120 125 

Cys Leu Arg Tyr Tyr Ala Gly Trp Ala Asp Lys Tyr His Gly Lys Thr 
130 135 140 

15 lie Pro lie Asp Gly Asp Phe Phe Ser Tyr Thr Arg His Glu Pro Val 
145 150 155 160 

Gly Val Cys Gly Gln Ile Ile Pro Trp Asn Phe Pro Leu Leu Met Gln
165 170 175 

Ala Trp Lys Leu Gly Pro Ala Leu Ala Thr Gly Asn Val Val Val Met 
20 180 185 190 

Lys Val Ala Glu Gin Thr Pro Leu Thr Ala Leu Tyr Val Ala Asn Leu 
195 200 205 

Ile Lys Glu Ala Gly Phe Pro Pro Gly Val Val Asn Ile Val Pro Gly
210 215 220

Phe Gly Pro Thr Ala Gly Ala Ala Ile Ala Ser His Glu Asp Val Asp
225 230 235 240 

Lys Val Ala Phe Thr Gly Ser Thr Glu He Gly Arg Val He Gin Val 
245 250 255 

5 Ala Ala Gly Ser Ser Asn Leu Lys Arg Val Thr Leu Glu Leu Gly Gly 
260 265 270 

Lys Ser Pro Asn He He Met Ser Asp Ala Asp Met Asp Trp Ala Val 
275 280 285 

Glu Gin Ala His Phe Ala Leu Phe Phe Asn Gin Gly Gin Cys Cys Cys 
10 290 295 300 

Ala Gly Ser Arg Thr Phe Val Gin Glu Asp He Tyr Asp Glu Phe Val 
305 310 315 320 

Val Arg Ser Val Ala Arg Ala Lys Ser Arg Val Val Gly Asn Pro Phe 
325 330 335 

15 Asp Ser Lys Thr Glu Gin Gly Pro Gin Val Asp Glu Thr Gin Phe Lys 
340 345 350 

Lys He Leu Gly Tyr He Asn Thr Gly Lys Gin Glu Gly Ala Lys Leu 
355 360 365 

Leu Cys Gly Gly Gly Ile Ala Ala Asp Arg Gly Tyr Phe Ile Gln Pro
20 370 375 380 

Thr Val Phe Gly Asp Val Gin Asp Gly Met Thr He Ala Lys Glu Glu 
385 390 395 400 

Ile Phe Gly Pro Val Met Gln Ile Leu Lys Phe Lys Thr Ile Glu Glu
405 410 415

Val Val Gly Arg Ala Asn Asn Ser Thr Tyr Gly Leu Ala Ala Ala Val
420 425 430 

Phe Thr Lys Asp Leu Asp Lys Ala Asn Tyr Leu Ser Gin Ala Leu Gin 
435 440 445 

5 Ala Gly Thr Val Trp Val Asn Cys Tyr Asp Val Phe Gly Ala Gin Ser 
450 455 460 

Pro Phe Gly Gly Tyr Lys Met Ser Gly Ser Gly Arg Glu Leu Gly Glu 
465 470 475 480 

Tyr Gly Leu Gin Ala Tyr Thr Glu Val Lys Thr Val Thr Val Lys Val 
10 485 490 495 

Pro Gin Lys Asn 
500 



<210> 5 
<211> 1512 
<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (37)..(1473)

<400> 5

gctaccatgg cttaaccggt accaaggaga tatcat atg tca gta ccc gtt caa 54
Met Ser Val Pro Val Gln
1 5

cat cct atg tat atc gat gga cag ttt gtt acc tgg cgt gga gac gca 102
His Pro Met Tyr Ile Asp Gly Gln Phe Val Thr Trp Arg Gly Asp Ala
10 15 20

tgg att gat gtg gta aac cct gct aca gag gct gtc att tcc cgc ata 150
Trp Ile Asp Val Val Asn Pro Ala Thr Glu Ala Val Ile Ser Arg Ile
25 30 35 

ccc gat ggt cag gcc gag gat gcc cgt aag gca ate gat gea gea gaa 198 
5 Pro Asp Gly Gin Ala Glu Asp Ala Arg Lys Ala lie Asp Ala Ala Glu 
40 45 50 

egt gea eaa cea gaa tgg gaa geg ttg ect get att gaa egc gcc agt 246 
Arg Ala Gin Pro Glu Trp Glu Ala Leu Pro Ala He Glu Arg Ala Ser 
55 60 65 70 

10 tgg ttg cgc aaa ate tee gee ggg ate ege gaa egc gee agt gaa ate 294 
Trp Leu Arg Lys He Ser Ala Gly lie Arg Glu Arg Ala Ser Glu He 
75 80 85 

agt geg ctg att gtt gaa gaa ggg ggc aag ate cag eag ctg get gaa 342 
Ser Ala Leu He Val Glu Glu Gly Gly Lys He Gin Gin Leu Ala Glu 
15 90 95 100 

gtc gaa gtg get ttt act gee gae tat ate gat tae atg geg gag tgg 390 
Val Glu Val Ala Phe Thr Ala Asp Tyr He Asp Tyr Met Ala Glu Trp 
105 110 115 

gea egg cgt tae gag gge gag att att eaa age gat egt cea gga gaa 438 
20 Ala Arg Arg Tyr Glu Gly Glu He He Gin Ser Asp Arg Pro Gly Glu 
120 125 130 

aat att ctt ttg ttt aaa cgt geg ctt ggt gtg act acc ggc att ctg 486 
Asn He Leu Leu Phe Lys Arg Ala Leu Gly Val Thr Thr Gly He Leu 
135 140 145 150 

ccg tgg aac ttc ccg ttc ttc ctc att gcc cgc aaa atg gct ccc gct 534
Pro Trp Asn Phe Pro Phe Phe Leu Ile Ala Arg Lys Met Ala Pro Ala
155 160 165

ctt ttg acc ggt aat acc atc gtc att aaa cct agt gaa ttt acg aca 582
Leu Leu Thr Gly Asn Thr Ile Val Ile Lys Pro Ser Glu Phe Thr Thr
170 175 180 

aac aat gcg att gca ttc gcc aaa ate gte gat gaa ata ggc ctt ccg 630 
5 Asn Asn Ala lie Ala Phe Ala Lys lie Val Asp Glu lie Gly Leu Pro 
185 190 195 

cgc ggc gtg ttt aac ctt gta ctg ggg cgt ggt gaa acc gtt ggg caa 678 
Arg Gly Val Phe Asn Leu Val Leu Gly Arg Gly Glu Thr Val Gly Gin 
200 205 210 

10 gaa ctg gcg ggt aac cca aag gtc gca atg gtc agt atg aca ggc age 726 
Glu Leu Ala Gly Asn Pro Lys Val Ala Met Val Ser Met Thr Gly Ser 
215 220 225 230 

gtc tct gca ggt gag aag ate atg gcg act gcg gcg aaa aac ate acc 774 
Val Ser Ala Gly Glu Lys He Met Ala Thr Ala Ala Lys Asn He Thr 
IS 235 240 245 

aaa gtg tgt ctg gaa ttg ggg ggt aaa gca cca get ate gta atg gae 822 
Lys Val Cys Leu Glu Leu Gly Gly Lys Ala Pro Ala He Val Met Asp 
250 255 260 

gat gee gat ctt gaa ctg gca gte aaa gcc ate gtt gat tea cgc gtc 870 
20 Asp Ala Asp Leu Glu Leu Ala Val Lys Ala He Val Asp Ser Arg Val 
265 270 275 

att aat agt ggg caa gtg tgt aac tgt gca gaa cgt gtt tat gta eag 918 
He Asn Ser Gly Gin Val Cys Asn Cys Ala Glu Arg Val Tyr Val Gin 
280 285 290 

aaa ggc att tat gat cag ttc gtc aat cgg ctg ggt gaa gcg atg cag 966
Lys Gly Ile Tyr Asp Gln Phe Val Asn Arg Leu Gly Glu Ala Met Gln
295 300 305 310

gcg gtt caa ttt ggt aac ccc gct gaa cgc aac gac att gcg atg ggg 1014
Ala Val Gln Phe Gly Asn Pro Ala Glu Arg Asn Asp Ile Ala Met Gly
315 320 325 

ccg ttg att aac gcc gcg gcg ctg gaa agg gtc gag caa aaa gtg gcg 1062 
5 Pro Leu He Asn Ala Ala Ala Leu Glu Arg Val Glu Gin Lys Val Ala 
330 335 340 

cgc gca gta gaa gaa ggg gcg aga gtg gcg ttc ggt ggc aaa gcg gta 1110 
Arg Ala Val Glu Glu Gly Ala Arg Val Ala Phe Gly Gly Lys Ala Val 
345 350 355 

10 gag ggg aaa gga tat tat tat ccg ccg aca ttg ctg ctg gat gtt cgc 1158 
Glu Gly Lys Gly Tyr Tyr Tyr Pro Pro Thr Leu Leu Leu Asp Val Arg 
360 365 370 

cag gaa atg teg att atg cat gag gaa acc ttt ggc ccg gtg ctg cca 1206 
Gin Glu Met Ser He Met His Glu Glu Thr Phe Gly Pro Val Leu Pro 
IS 375 380 385 390 

gtt gtc gca ttt gac acg ctg gaa gat get ate tea atg get aat gac 1254 
Val Val Ala Phe Asp Thr Leu Glu Asp Ala He Ser Met Ala Asn Asp 
395 400 405 

agt gat tac ggc ctg acc tea tea ate tat acc caa aat ctg aac gtc 1302 
20 Ser Asp Tyr Gly Leu Thr Ser Ser He Tyr Thr Gin Asn Leu Asn Val 
410 415 420 

gcg atg aaa gee att aaa ggg ctg aag ttt ggt gaa act tac ate aac 1350 
Ala Met Lys Ala He Lys Gly Leu Lys Phe Gly Glu Thr Tyr He Asn 
425 430 435 

cgt gaa aac ttc gaa gct atg caa ggc ttc cac gcc gga tgg cgt aaa 1398
Arg Glu Asn Phe Glu Ala Met Gln Gly Phe His Ala Gly Trp Arg Lys
440 445 450

tcc ggt att ggc ggc gca gat ggt aaa cat ggc ttg cat gga tat ctg 1446
Ser Gly Ile Gly Gly Ala Asp Gly Lys His Gly Leu His Gly Tyr Leu
455 460 465 470

cag acc cag gtg gtt tat tta cag tct taagagctcg aattcccgtc 1493
Gln Thr Gln Val Val Tyr Leu Gln Ser
475 

gacggctcta gactcgagcg 1513 



<210> 6 
<211> 479 
<212> PRT

<213> Escherichia coli

<400> 6

Met Ser Val Pro Val Gln His Pro Met Tyr Ile Asp Gly Gln Phe Val
1 5 10 15

Thr Trp Arg Gly Asp Ala Trp Ile Asp Val Val Asn Pro Ala Thr Glu
20 25 30

Ala Val Ile Ser Arg Ile Pro Asp Gly Gln Ala Glu Asp Ala Arg Lys
35 40 45

Ala lie Asp Ala Ala Glu Arg Ala Gin Pro Glu Trp Glu Ala Leu Pro 
20 50 55 60 

Ala He Glu Arg Ala Ser Trp Leu Arg Lys He Ser Ala Gly He Arg 
65 70 75 80 

Glu Arg Ala Ser Glu He Ser Ala Leu He Val Glu Glu Gly Gly Lys 
85 90 95 

25 He Gin Gin Leu Ala Glu Val Glu Val Ala Phe Thr Ala Asp Tyr He 
100 105 110 
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Asp Tyx Met Ala 61u Trp Ala Arg Arg Tyx Glu Qly 61u lie lie Gin 
115 120 125 

Ser Asp Arg Pro Gly Glu Asn lie Leu Leu Phe Lys Arg Ala Leu Gly 
130 135 140 

5 Val Thr Thr Gly lie Leu Pro Trp Asn Phe Pro Phe Phe Leu lie Ala 
145 150 155 160 

Arg Lys Met Ala Pro Ala Leu Leu Thr Gly Asn Thr lie Val lie Lys 
165 170 175 

Pro Ser Glu Phe Thr Thr Asn Asn Ala lie Ala Phe Ala Lys lie Val 
10 180 185 190 

Asp Glu He Gly Leu Pro Arg Gly Val Phe Asn Leu Val Leu Gly Arg 
X95 200 205 

Gly Glu Thr Val Gly Gin Glu Leu Ala Gly Asn Pro Lys Val Ala Met 
210 215 220 

IS Val Ser Met Thr Gly Ser Val Ser Ala Gly Glu Lys He Met Ala Thr 
225 230 235 240 

Ala Ala Lys Asn He Thr Lys Val Cys Leu Glu Leu Gly Gly Lys Ala 
245 250 255 

Pro Ala He Val Met Asp Asp Ala Asp Leu Glu Leu Ala Val Lys Ala 
20 260 265 270 

He Val Asp Ser Arg Val He Asn Ser Gly Gin Val Cys Asn Cys Ala 
275 280 285 

Glu Arg Val Tyr Val Gin Lys Gly He Tyr Asp Gin Phe Val Asn Arg 
290 295 300 
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Leu 6ly Glu Ala Met Gin Ala Val Gin Phe Gly Asn Pro Ala Glu Arg 
305 310 315 320 

Asn Asp lie Ala Met Gly Pro Leu lie Asn Ala Ala Ala Leu Glu Arg 
325 330 335 

S Val Glu Gin Lys Val Ala Arg Ala Val Glu Glu Gly Ala Arg Val Ala 
340 345 350 

Phe Gly Gly Lys Ala Val Glu Gly Lys Gly Tyr Tyr Tyr Pro Pro Thr 
355 360 365 

Leu Leu Leu Asp Val Arg Gin Glu Met Ser lie Net His Glu Glu Thr 
10 370 375 380 

Phe Gly Pro Val Leu Pro Val Val Ala Phe Asp Thr Leu Glu Asp Ala 
385 390 395 400 

lie Ser Met Ala Asn Asp Ser Asp Tyr Gly Leu Thr Ser Ser lie Tyr 
405 410 415 

15 Thr Gin Asn Leu Asn Val Ala Met Lys Ala lie Lys Gly Leu Lys Phe 
420 425 430 

Gly Glu Thr Tyr lie Asn Arg Glu Asn Phe Glu Ala Met Gin Gly Phe 
435 440 445 

His Ala Gly Trp Arg Lys Ser Gly lie Gly Gly Ala Asp Gly Lys His 
20 450 455 460 

Gly Leu His Gly Tyr Leu Gin Thr Gin Val Val Tyr Leu Gin Ser 
465 470 475 
<210> 7

<211> 1574

<212> DNA

<213> Escherichia coli

<220>

<221> CDS

<222> (22)..(1557)

<400> 7

gcggtaccaa ggaggtatca t atg acc aat aat ccc cct tca gca cag att 51
Met Thr Asn Asn Pro Pro Ser Ala Gln Ile
1 5 10

aag ccc ggc gag tat ggt ttc ccc ctc aag tta aaa gcc cgc tat gac 99
Lys Pro Gly Glu Tyr Gly Phe Pro Leu Lys Leu Lys Ala Arg Tyr Asp
15 20 25

aac ttt att ggc ggc gaa tgg gta gcc cct gcc gac ggc gag tat tac 147
Asn Phe Ile Gly Gly Glu Trp Val Ala Pro Ala Asp Gly Glu Tyr Tyr
30 35 40

cag aat ctg acg ccg gtg acc ggg cag ctg ctg tgc gaa gtg gcg tct 195
Gln Asn Leu Thr Pro Val Thr Gly Gln Leu Leu Cys Glu Val Ala Ser
45 50 55

tcg ggc aaa cga gac atc gat ctg gcg ctg gat gct gcg cac aaa gtg 243
Ser Gly Lys Arg Asp Ile Asp Leu Ala Leu Asp Ala Ala His Lys Val
60 65 70

aaa gat aaa tgg gcg cac acc tcg gtg cag gat cgt gcg gcg att ctg 291
Lys Asp Lys Trp Ala His Thr Ser Val Gln Asp Arg Ala Ala Ile Leu
75 80 85 90

ttt aag att gcc gat cga atg gaa caa aac ctc gag ctg tta gcg aca 339
Phe Lys Ile Ala Asp Arg Met Glu Gln Asn Leu Glu Leu Leu Ala Thr
95 100 105

gct gaa acc tgg gat aac ggc aaa ccc att cgc gaa acc agt gct gcg 387
Ala Glu Thr Trp Asp Asn Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala
110 115 120

gat gta ccg ctg gcg att gac cat ttc cgc tat ttc gcc tcg tgt att 435
Asp Val Pro Leu Ala Ile Asp His Phe Arg Tyr Phe Ala Ser Cys Ile
125 130 135

cgg gcg cag gaa ggt ggg atc agt gaa gtt gat agc gaa acc gtg gcc 483
Arg Ala Gln Glu Gly Gly Ile Ser Glu Val Asp Ser Glu Thr Val Ala
140 145 150

tat cat ttc cat gaa ccg tta ggc gtg gtg ggg cag att atc ccg tgg 531
Tyr His Phe His Glu Pro Leu Gly Val Val Gly Gln Ile Ile Pro Trp
155 160 165 170

aac ttc ccg ctg ctg atg gcg agc tgg aaa atg gct ccc gcg ctg gcg 579
Asn Phe Pro Leu Leu Met Ala Ser Trp Lys Met Ala Pro Ala Leu Ala
175 180 185

gcg ggc aac tgt gtg gtg ctg aaa ccc gca cgt ctt acc ccg ctt tct 627
Ala Gly Asn Cys Val Val Leu Lys Pro Ala Arg Leu Thr Pro Leu Ser
190 195 200

gta ctg ctg cta atg gaa att gtc ggt gat tta ctg ccg ccg ggc gtg 675
Val Leu Leu Leu Met Glu Ile Val Gly Asp Leu Leu Pro Pro Gly Val
205 210 215

gtg aac gtg gtc aat ggc gca ggt ggg gta att ggc gaa tat ctg gcg 723
Val Asn Val Val Asn Gly Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala
220 225 230

acc tcg aaa cgc atc gcc aaa gtg gcg ttt acc ggc tca acg gaa gtg 771
Thr Ser Lys Arg Ile Ala Lys Val Ala Phe Thr Gly Ser Thr Glu Val
235 240 245 250






ggc caa caa att atg caa tac gca acg caa aac att att ccg gtg acg 819
Gly Gln Gln Ile Met Gln Tyr Ala Thr Gln Asn Ile Ile Pro Val Thr
255 260 265

ctg gag ttg ggc ggt aag tcg cca aat atc gtc ttt gct gat gtg atg 867
Leu Glu Leu Gly Gly Lys Ser Pro Asn Ile Val Phe Ala Asp Val Met
270 275 280

gat gaa gaa gat gcc ttt ttc gat aaa gcg ctg gaa ggc ttt gca ctg 915
Asp Glu Glu Asp Ala Phe Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu
285 290 295

ttt gcc ttt aac cag ggc gaa gtt tgc acc tgt ccg agt cgt gct tta 963
Phe Ala Phe Asn Gln Gly Glu Val Cys Thr Cys Pro Ser Arg Ala Leu
300 305 310

gtg cag gaa tct atc tac gaa cgc ttt atg gaa cgc gcc atc cgc cgt 1011
Val Gln Glu Ser Ile Tyr Glu Arg Phe Met Glu Arg Ala Ile Arg Arg
315 320 325 330

gtc gaa agc att cgt agc ggt aac ccg ctc gac agc gtg acg caa atg 1059
Val Glu Ser Ile Arg Ser Gly Asn Pro Leu Asp Ser Val Thr Gln Met
335 340 345

ggc gcg cag gtt tct cac ggg caa ctg gaa acc atc ctc aac tac att 1107
Gly Ala Gln Val Ser His Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile
350 355 360

gat atc ggt aaa aaa gag ggc gct gac gtg ctc aca ggc ggg cgg cgc 1155
Asp Ile Gly Lys Lys Glu Gly Ala Asp Val Leu Thr Gly Gly Arg Arg
365 370 375

aag ctg ctg gaa ggt gaa ctg aaa gac ggc tac tac ctc gaa ccg acg 1203
Lys Leu Leu Glu Gly Glu Leu Lys Asp Gly Tyr Tyr Leu Glu Pro Thr
380 385 390

att ctg ttt ggt cag aac aat atg cgg gtg ttc cag gag gag att ttt 1251
Ile Leu Phe Gly Gln Asn Asn Met Arg Val Phe Gln Glu Glu Ile Phe
395 400 405 410

ggc ccg gtg ctg gcg gtg acc acc ttc aaa acg atg gaa gaa gcg ctg 1299
Gly Pro Val Leu Ala Val Thr Thr Phe Lys Thr Met Glu Glu Ala Leu
415 420 425

gag ctg gcg aac gat acg caa tat ggc ctg ggc gcg ggc gtc tgg agc 1347
Glu Leu Ala Asn Asp Thr Gln Tyr Gly Leu Gly Ala Gly Val Trp Ser
430 435 440

cgc aac ggt aat ctg gcc tat aag atg ggg cgc ggc ata cag gct ggg 1395
Arg Asn Gly Asn Leu Ala Tyr Lys Met Gly Arg Gly Ile Gln Ala Gly
445 450 455

cgc gtg tgg acc aac tgt tat cac gct tac ccg gca cat gcg gcg ttt 1443
Arg Val Trp Thr Asn Cys Tyr His Ala Tyr Pro Ala His Ala Ala Phe
460 465 470

ggt ggc tac aaa caa tca ggt atc ggt cgc gaa acc cac aag atg atg 1491
Gly Gly Tyr Lys Gln Ser Gly Ile Gly Arg Glu Thr His Lys Met Met
475 480 485 490

ctg gag cat tac cag caa acc aag tgc ctg ctg gtg agc tac tcg gat 1539
Leu Glu His Tyr Gln Gln Thr Lys Cys Leu Leu Val Ser Tyr Ser Asp
495 500 505

aaa ccg ttg ggg ctg ttc taagagctcg aattcgc 1574
Lys Pro Leu Gly Leu Phe
510 
<210> 8
<211> 512
<212> PRT

<213> Escherichia coli
<400> 8

Met Thr Asn Asn Pro Pro Ser Ala Gln Ile Lys Pro Gly Glu Tyr Gly
1 5 10 15

Phe Pro Leu Lys Leu Lys Ala Arg Tyr Asp Asn Phe Ile Gly Gly Glu
20 25 30

Trp Val Ala Pro Ala Asp Gly Glu Tyr Tyr Gln Asn Leu Thr Pro Val
35 40 45

Thr Gly Gln Leu Leu Cys Glu Val Ala Ser Ser Gly Lys Arg Asp Ile
50 55 60

Asp Leu Ala Leu Asp Ala Ala His Lys Val Lys Asp Lys Trp Ala His
65 70 75 80

Thr Ser Val Gln Asp Arg Ala Ala Ile Leu Phe Lys Ile Ala Asp Arg
85 90 95

Met Glu Gln Asn Leu Glu Leu Leu Ala Thr Ala Glu Thr Trp Asp Asn
100 105 110

Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala Asp Val Pro Leu Ala Ile
115 120 125

Asp His Phe Arg Tyr Phe Ala Ser Cys Ile Arg Ala Gln Glu Gly Gly
130 135 140

Ile Ser Glu Val Asp Ser Glu Thr Val Ala Tyr His Phe His Glu Pro
145 150 155 160

Leu Gly Val Val Gly Gln Ile Ile Pro Trp Asn Phe Pro Leu Leu Met
165 170 175

Ala Ser Trp Lys Met Ala Pro Ala Leu Ala Ala Gly Asn Cys Val Val
180 185 190

Leu Lys Pro Ala Arg Leu Thr Pro Leu Ser Val Leu Leu Leu Met Glu
195 200 205

Ile Val Gly Asp Leu Leu Pro Pro Gly Val Val Asn Val Val Asn Gly
210 215 220

Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala Thr Ser Lys Arg Ile Ala
225 230 235 240

Lys Val Ala Phe Thr Gly Ser Thr Glu Val Gly Gln Gln Ile Met Gln
245 250 255

Tyr Ala Thr Gln Asn Ile Ile Pro Val Thr Leu Glu Leu Gly Gly Lys
260 265 270

Ser Pro Asn Ile Val Phe Ala Asp Val Met Asp Glu Glu Asp Ala Phe
275 280 285

Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu Phe Ala Phe Asn Gln Gly
290 295 300

Glu Val Cys Thr Cys Pro Ser Arg Ala Leu Val Gln Glu Ser Ile Tyr
305 310 315 320

Glu Arg Phe Met Glu Arg Ala Ile Arg Arg Val Glu Ser Ile Arg Ser
325 330 335

Gly Asn Pro Leu Asp Ser Val Thr Gln Met Gly Ala Gln Val Ser His
340 345 350

Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile Asp Ile Gly Lys Lys Glu
355 360 365

Gly Ala Asp Val Leu Thr Gly Gly Arg Arg Lys Leu Leu Glu Gly Glu
370 375 380

Leu Lys Asp Gly Tyr Tyr Leu Glu Pro Thr Ile Leu Phe Gly Gln Asn
385 390 395 400

Asn Met Arg Val Phe Gln Glu Glu Ile Phe Gly Pro Val Leu Ala Val
405 410 415

Thr Thr Phe Lys Thr Met Glu Glu Ala Leu Glu Leu Ala Asn Asp Thr
420 425 430

Gln Tyr Gly Leu Gly Ala Gly Val Trp Ser Arg Asn Gly Asn Leu Ala
435 440 445

Tyr Lys Met Gly Arg Gly Ile Gln Ala Gly Arg Val Trp Thr Asn Cys
450 455 460

Tyr His Ala Tyr Pro Ala His Ala Ala Phe Gly Gly Tyr Lys Gln Ser
465 470 475 480

Gly Ile Gly Arg Glu Thr His Lys Met Met Leu Glu His Tyr Gln Gln
485 490 495

Thr Lys Cys Leu Leu Val Ser Tyr Ser Asp Lys Pro Leu Gly Leu Phe
500 505 510



<210> 9 
<211> 5267 
<212> DNA

<213> Klebsiella pneumoniae
<220>

<223> Location complement 300..2153
<220>

<223> Location complement 2166..2591
<220>

<223> Location complement 2594..3034
<220>

<223> Location complement 2191..4858



<400> 9 



agcgctatat gcgttgatgc aatttctatg cgcacccgtt ctcggagcac tgtccgaccg 60
ctttggccgc cgcccagtcc tgctcgcttc gctacttgga gccactatcg actacgcgat 120
catggcgacc acacccgtcc tgtggatctc ccactgacca aagctggccc cggcgacccg 180
cagcgcagcg acgcagcccg cgccgaagaa aatgagcaat ccggtgccaa gaaactcggc 240
cacgcactgc ccggttaagg tagaagtctg gttcattatc ggcatcctga aatagcacgt 300
taaagagaga ggctggcgcg agcgcccgtt taattcgcct gaccggccag tagcagcccg 360
cattgcgcgg cccttctgtt [illegible in source] 420
ccatagtgcg acaaggcttc cgtgataagc tgcgggatct caaagtccag cgatgagccg 480
cccaccagca ccacaaaggc gatatcgcga atggaaccgc cgggtgagac ctggcgcagc 540
gcgcgcaggc agttggtgac aaacactttc tctttcgcct gccggcgcac gagacgaatt 600
ttttccagcg ggctggcgtt atcgatcggc accagttcgc cctccttgat gtacaccact 660
ttggcgaaca ccgccgggct gagggcttcc cgaaagaact ccaccgcgcc attctcgtga 720
cgaatactga acaggctttc cactttggcc agcgggtatt tttttatcgc ttccgccagc 780
gaaagatcct cgaggcccag ctcggtttta atcaacaggc tgaccatatt ccccgccccg 840
gcgagatgga ccgccgttat ctgcccctcc gcgttgacga tcgccgcatc cgtcgagccg 900
gcgccgaggt cgaggatcgc cagcggcgcc gcacagccgg gagtggttaa cgccccggcg 960
atggccatgt tggcctccac gccgcccacc accacctcgg tctgcagtcg ggcgctcagt 1020
tcgcgggcga taacctgcat ttgcagacga tccgctttca ccatcgccgc catcccgacg 1080
gcattctcca tggcgcactc gccggccatc ccgccctgca ccttgcgcgg aataaacgta 1140
tccaccgcca gcagatcctg gatgtatatc gcgctcatct catggccggt cagggacgcc 1200
attaccttgc gcacccgctc aagcatgccg ccggcgtggg tgcccggttc gccgcggatg 1260
tcgcgtaccg gagcgcaggc gctcatcgcc tgcatgatgg cttccgcgcc ctcggcgaca 1320
tcggcctctc cgcggcgctt ttcgccgcta atgtagaggt tgcccgccgg gatcacccgc 1380
gactgcacat ccccctgcgg ggtcttgagc accaccgcgg aacggttgcc aatcagggcg 1440
cgggcgatgg ggacgatggc ctgggtctct tccgggctta gcccgaagaa ggtggcgatc 1500
ccgtagggat tcgacaggat ccgcaccacc tggcccggcg cggccacttc caccgccgcc 1560 
attaccccct cggggacctg ctccagcagc gtcacttcat ccaccaccgg cagggtttta 1620 
cgcaggcggt tgttcaccag cacgccgtcg tcctttttga ggatcgccgc caccacgttg 1680 
atcccccggt cgagcgcctc attgagccac cacacggcgt caaggaaatc gacggcgtcg 1740
tcaatcagta cgatccaccc ctcggcatac tgcgccgccg gcagcgtcgc cagccgcccg 1800 
agggcgatag tcgtccccac gccaacgccc accccgcccg gcgtctgcgg gttatgaccg 1860 
atcatggtcg attcggtgat aatggtctcg gtgatggtct ccatcgccac atcgccaatc 1920 
accggcgcgg cttcgttaag atagatgcga gagacatcgc tcatcgacca cggtgttttc 1980 
gccagggcct gctccagcgc ggcgagggtc ccggcgatat tgtcccgcgt ccctttcatg 2040
cccgtcgtcg cgacgatccc gctggcaaca aacgccctcg cctgcgggta gtcggacgcc 2100 
agcgccacct cggtggtggc gttgccgata tcaatcccgg ctattaacgg cacgctgacc 2160 
tccgcttagc ttcctttacg cagcttatgc cgctgctgat acacttccgc cgactcccgg 2220 
acaaaggcgg cattcactgt cgcatgccag gtgtgctcca gctcgtcggc gatcgccagc 2280 
agctccgcct gcgaggagcg gaacgggcgc agcgcgttat agatagccag aatgcgctcg 2340
tcaggaatgg cgataagctc cgccgcgcgg cggaaattgc gcgccaccgc atggcgctgc 2400 
atctgctcgg caatctgcgc ctggtactca agggtctggc gggagatccg cacatcctgc 2460 
gggcccacct cgccagagag caccttctcg agggtaatat cggtcaatgg tttgccggta 2520 
ggcgtcagga tatgctccgg gcagcgggtg gctaacggat aatcctgcac gcgcatggtt 2580 
ttctcgctca tggtcactcc cttactaagt cgatgtgcag ggtgacgggc tcggcgtcct 2640
gcaccacatg tttggtctct ttgatatgaa atagcgcggc tttggccata aatttcggcc 2700 
gcaccatctg atcgttcacc accggcaccg gcgaaggtga ctctttgcgc gcatagcgcg 2760 
cagcgttttt gccaatctgc cggtaggtct ccagcgtcag cagcggcgcc tgggagaaca 2820 
gctccaggtt gctgagcggc agcagatcgc gctgatggat gaccgtggtc cccttcgact 2880 
ggataccgat gccgatcccc gagccgctca ggttggccgc atcccaggcc ataaaggaga 2940
cgtcggacgt gcgcagaatg cgcaccaccc gggcgtgaag cccctcttct tccaccccgg 3000 
caatcagctc tttgaggatc gcgccatggg gcatatcgat cagagtgtga tgctggtgtt 3060 
tatcgaaggc agggccgacg ccgatcacca cttcatcggc gcgttcatcg gcagaagcta 3120 
ccccgccctc gcgggttttc agggtaaaag agggctgaat ttgggttgtc tgttgcacag 3180 
gaataccgcc ttattcaatg gtgtcgggct gaaccacgcc cggaatattt ttgatctccg 3240
cccagcgttc ggcagagatg cgatagccgg tgcccggccc ctgatagtca ttgatgtcgt 3300 
tgaccgcact caccacctcg aactgccgat cgagaatggc cgaggtctgc aggtaatcgc 3360 
cggtgacccg ctggcgcagc atattgagaa tattgctggc gatatcctca aagccgctgc 3420 
ggctcagcgc gccgacaata tcgaggccgg tgatgttgcg cttcatcatc tcttccaccg 3480 
cactcagatc ctccaccacg ttacgcggcg gcatctcgtt gctgccgtgc gcgtaggtgg 3540
cggcctccac ctcctcgtcg gcgattggcg gcagccccag ctcgcggaaa accgcctgga 3600 
tcgcccgcgc cgctttctgg cgaatggcaa tggtttccgc ctcggtcacc ggacgcaggc 3660 
c^rcc^t caac cat c agg^ ca cgctgcagga tgttgtaatc atcaaaatct Cccgcatcga 3720
agttcgagcc ggcgaacatg ttgtcgtagt tcggcaccgc gctgtagccg gagaaaataa 3780
agtcggtgcc cggcagcatc tgcatcaggg tgcgcgcggt gcggcgaata tccgagtggg 3840
agaaag t c gtcgttggcg gacgccactt cgaggtcgag catagaggcg atcaggtttt 3900
ccgccagcac cgcccgaatg cccgacggca cagcgccggt catgccgata cagctcaccg 3960
cgccgttttg cagtccctga accccggcgc ctttagtaat gaagatgcag cgcgattcga 4020
ggtagagcat cgacttgctc tccgaatagc ccatcagcgc ttcggatccg gtgccggagg 4080
tgtagcgcat tttcaacccg cgggaggcgt aggccgaggc gaggaacgcc tttgaccacg 4140
gcgtatcatc gccgtcggta aataccgctt cggtgccgta gaccgacacc gtctcggcgt 4200
agctggttaa gccacgcatg cccagctcca gctcggtggc ctcttccacc gagcactgcg 4260
tcaacacgcc ggggcggccg cactgcgaac cgaccaacag cgccagggcg ttaaacggcg 4320
cgtagcgcgc gataccgacc gtggtctcct gttctgagaa gccgcggatc ccggcctcgg 4380
cggcgtcagc ggcaatctgc accggattat ctttgagatt ggtgacgtgg cactggttgg 4440
agggggtccg gcgggcacgc atcttctgca gcgccatcat catctccacc acgttcatct 4500
gcgccatcac ctcgaccgct ttggccggcg tgatggcggt agtgatggca atgatctcct 4560
cccggctgac gtgaatatcc accagcatac gggctatttc caccgcctcc ^Srsrcgcattg 4620
cctgctctgt gcgctcaacg ttgatcgcgt aatcggcgat aaatcggtcg atcatgtcaa 4680
actggtcccg gcgtttgccg tccagttcga cgatcagacc gttgtccact tttactgaag 4740
agaccgggtc aaaggggctg tccatggcga tcagcccctc ttcaggccac tcgccaatca 4800
gcccgtcctg attgacgggg cgctgggcca gtactgcaaa tcgttttgat cttttcattg 4860
ttcatcggct caaaaggtga aatccgcaga cggtagcgaa tacgccgggc cagcgtcgtt 4920
gccgcccggc cattaccggc aatagcggaa ctttaaatga gccagtggtg aaaaaaataa 4980
atttaatttc gtttcaattt ggcacacgaa atctaccgac agtttcacta tgaaacttta 5040
ctccggcggc aaaaataaaa aatgtgatcg cccgcaatga tataaatcaa ttaataaaaa 5100
acgcccttaa ttacgttttt ccgacgctat tttaacccta ttgactaaat catggcgggc 5160
gacaaaataa cgctgacaaa aataaagcaa gccaaccgaa tggtaatagt tttttactat 5220
cgccccctac tgactattcg cgccagcgtt atcctggtgc gggagaga 5268



<210> 10 

<211> 607 

<212> PRT

<213> Klebsiella pneumoniae

<400> 10

Met Pro Leu Ile Ala Gly Ile Asp Ile Gly Asn Ala Thr Thr Glu Val
1 5 10 15

Ala Leu Ala Ser Asp Tyr Pro Gln Ala Arg Ala Phe Val Ala Ser Gly
20 25 30

Ile Val Ala Thr Thr Gly Met Lys Gly Thr Arg Asp Asn Ile Ala Gly
35 40 45

Thr Leu Ala Ala Leu Glu Gln Ala Leu Ala Lys Thr Pro Trp Ser Met
50 55 60

Ser Asp Val Ser Arg Ile Tyr Leu Asn Glu Ala Ala Pro Val Ile Gly
65 70 75 80

Asp Val Ala Met Glu Thr Ile Thr Glu Thr Ile Ile Thr Glu Ser Thr
85 90 95

Met Ile Gly His Asn Pro Gln Thr Pro Gly Gly Val Gly Val Gly Val
100 105 110

Gly Thr Thr Ile Ala Leu Gly Arg Leu Ala Thr Leu Pro Ala Ala Gln
115 120 125

Tyr Ala Glu Gly Trp Ile Val Leu Ile Asp Asp Ala Val Asp Phe Leu
130 135 140

Asp Ala Val Trp Trp Leu Asn Glu Ala Leu Asp Arg Gly Ile Asn Val
145 150 155 160

Val Ala Ala Ile Leu Lys Lys Asp Asp Gly Val Leu Val Asn Asn Arg
165 170 175

Leu Arg Lys Thr Leu Pro Val Val Asp Glu Val Thr Leu Leu Glu Gln
180 185 190

Val Pro Glu Gly Val Met Ala Ala Val Glu Val Ala Ala Pro Gly Gln
195 200 205

Val Val Arg Ile Leu Ser Asn Pro Tyr Gly Ile Ala Thr Phe Phe Gly
210 215 220

Leu Ser Pro Glu Glu Thr Gln Ala Ile Val Pro Ile Ala Arg Ala Leu
225 230 235 240

Ile Gly Asn Arg Ser Ala Val Val Leu Lys Thr Pro Gln Gly Asp Val
245 250 255

Gln Ser Arg Val Ile Pro Ala Gly Asn Leu Tyr Ile Ser Gly Glu Lys
260 265 270

Arg Arg Gly Glu Ala Asp Val Ala Glu Gly Ala Glu Ala Ile Met Gln
275 280 285

Ala Met Ser Ala Cys Ala Pro Val Arg Asp Ile Arg Gly Glu Pro Gly
290 295 300

Thr His Ala Gly Gly Met Leu Glu Arg Val Arg Lys Val Met Ala Ser
305 310 315 320

Leu Thr Gly His Glu Met Ser Ala Ile Tyr Ile Gln Asp Leu Leu Ala
325 330 335

Val Asp Thr Phe Ile Pro Arg Lys Val Gln Gly Gly Met Ala Gly Glu
340 345 350

Cys Ala Met Glu Asn Ala Val Gly Met Ala Ala Met Val Lys Ala Asp
355 360 365

Arg Leu Gln Met Gln Val Ile Ala Arg Glu Leu Ser Ala Arg Leu Gln
370 375 380

Thr Glu Val Val Val Gly Gly Val Glu Ala Asn Met Ala Ile Ala Gly
385 390 395 400

Ala Leu Thr Thr Pro Gly Cys Ala Ala Pro Leu Ala Ile Leu Asp Leu
405 410 415

Gly Ala Gly Ser Thr Asp Ala Ala Ile Val Asn Ala Glu Gly Gln Ile
420 425 430

Thr Ala Val His Leu Ala Gly Ala Gly Asn Met Val Ser Leu Leu Ile
435 440 445

Lys Thr Glu Leu Gly Leu Glu Asp Leu Ser Leu Ala Glu Ala Ile Lys
450 455 460

Lys Tyr Pro Leu Ala Lys Val Glu Ser Leu Phe Ser Ile Arg His Glu
465 470 475 480

Asn Gly Ala Val Glu Phe Phe Arg Glu Ala Leu Ser Pro Ala Val Phe
485 490 495

Ala Lys Val Val Tyr Ile Lys Glu Gly Glu Leu Val Pro Ile Asp Asn
500 505 510

Ala Ser Pro Leu Glu Lys Ile Arg Leu Val Arg Arg Gln Ala Lys Glu
515 520 525

Lys Val Phe Val Thr Asn Cys Leu Arg Ala Leu Arg Gln Val Ser Pro
530 535 540

Gly Gly Ser Ile Arg Asp Ile Ala Phe Val Val Leu Val Gly Gly Ser
545 550 555 560

Ser Leu Asp Phe Glu Ile Pro Gln Leu Ile Thr Glu Ala Leu Ser His
565 570 575

Tyr Gly Val Val Ala Gly Gln Gly Asn Ile Arg Gly Thr Glu Gly Pro
580 585 590

Arg Asn Ala Val Ala Thr Gly Leu Leu Leu Ala Gly Gln Ala Asn
595 600 605



<210> 11 
<211> 141 
<212> PRT

<213> Klebsiella pneumoniae

<400> 11

Met Ser Glu Lys Thr Met Arg Val Gln Asp Tyr Pro Leu Ala Thr Arg
1 5 10 15

Cys Pro Glu His Ile Leu Thr Pro Thr Gly Lys Pro Leu Thr Asp Ile
20 25 30

Thr Leu Glu Lys Val Leu Ser Gly Glu Val Gly Pro Gln Asp Val Arg
35 40 45

Ile Ser Arg Gln Thr Leu Glu Tyr Gln Ala Gln Ile Ala Glu Gln Met
50 55 60

Gln Arg His Ala Val Ala Arg Asn Phe Arg Arg Ala Ala Glu Leu Ile
65 70 75 80

Ala Ile Pro Asp Glu Arg Ile Leu Ala Ile Tyr Asn Ala Leu Arg Pro
85 90 95

Phe Arg Ser Ser Gln Ala Glu Leu Leu Ala Ile Ala Asp Glu Leu Glu
100 105 110

His Thr Trp His Ala Thr Val Asn Ala Ala Phe Val Arg Glu Ser Ala
115 120 125

Glu Val Tyr Gln Gln Arg His Lys Leu Arg Lys Gly Ser
130 135 140
<210> 12
<211> 146
<212> PRT

<213> Klebsiella pneumoniae
<400> 12

Met Pro His Gly Ala Ile Leu Lys Glu Leu Ile Ala Gly Val Glu Glu
1 5 10 15

Glu Gly Leu His Ala Arg Val Val Arg Ile Leu Arg Thr Ser Asp Val
20 25 30

Ser Phe Met Ala Trp Asp Ala Ala Asn Leu Ser Gly Ser Gly Ile Gly
35 40 45

Ile Gly Ile Gln Ser Lys Gly Thr Thr Val Ile His Gln Arg Asp Leu
50 55 60

Leu Pro Leu Ser Asn Leu Glu Leu Phe Ser Gln Ala Pro Leu Leu Thr
65 70 75 80

Leu Glu Thr Tyr Arg Gln Ile Gly Lys Asn Ala Ala Arg Tyr Ala Arg
85 90 95

Lys Glu Ser Pro Ser Pro Val Pro Val Val Asn Asp Gln Met Val Arg
100 105 110

Pro Lys Phe Met Ala Lys Ala Ala Leu Phe His Ile Lys Glu Thr Lys
115 120 125

His Val Val Gln Asp Ala Glu Pro Val Thr Leu His Ile Asp Leu Val
130 135 140

Arg Glu
145
<210> 13
<211> 555
<212> PRT

<213> Klebsiella pneumoniae
<400> 13

Met Lys Arg Ser Lys Arg Phe Ala Val Leu Ala Gln Arg Pro Val Asn
1 5 10 15

Gln Asp Gly Leu Ile Gly Glu Trp Pro Glu Glu Gly Leu Ile Ala Met
20 25 30

Asp Ser Pro Phe Asp Pro Val Ser Ser Val Lys Val Asp Asn Gly Leu
35 40 45

Ile Val Glu Leu Asp Gly Lys Arg Arg Asp Gln Phe Asp Met Ile Asp
50 55 60

Arg Phe Ile Ala Asp Tyr Ala Ile Asn Val Glu Arg Thr Glu Gln Ala
65 70 75 80

Met Arg Leu Glu Ala Val Glu Ile Ala Arg Met Leu Val Asp Ile His
85 90 95

Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro Ala
100 105 110

Lys Ala Val Glu Val Met Ala Gln Met Asn Val Val Glu Met Met Met
115 120 125

Ala Leu Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Asn Gln Cys His
130 135 140

Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala
145 150 155 160

Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175

Ala Arg Tyr Ala Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln
180 185 190

Cys Gly Arg Pro Gly Val Leu Thr Gln Cys Ser Val Glu Glu Ala Thr
195 200 205

Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr Ala Glu Thr Val
210 215 220

Ser Val Tyr Gly Thr Glu Ala Val Phe Thr Asp Gly Asp Asp Thr Pro
225 230 235 240

Trp Ser Lys Ala Phe Leu Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys
245 250 255

Met Arg Tyr Thr Ser Gly Thr Gly Ser Glu Ala Leu Met Gly Tyr Ser
260 265 270

Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile Phe Ile Thr
275 280 285

Lys Gly Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile
290 295 300

Gly Met Thr Gly Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu
305 310 315 320

Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala Ser Ala Asn Asp
325 330 335

Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr Leu Met
340 345 350

Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val
355 360 365

Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Phe Asp Ala Glu Asp
370 375 380

Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met Val Asp Gly Gly
385 390 395 400

Leu Arg Pro Val Thr Glu Ala Glu Thr Ile Ala Ile Arg Gln Lys Ala
405 410 415

Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Pro Ile
420 425 430

Ala Asp Glu Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Asn Glu
435 440 445

Met Pro Pro Arg Asn Val Val Glu Asp Leu Ser Ala Val Glu Glu Met
450 455 460

Met Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala Leu Ser Arg
465 470 475 480

Ser Gly Phe Glu Asp Ile Ala Ser Asn Ile Leu Asn Met Leu Arg Gln
485 490 495

Arg Val Thr Gly Asp Tyr Leu Gln Thr Ser Ala Ile Leu Asp Arg Gln
500 505 510

Phe Glu Val Val Ser Ala Val Asn Asp Ile Asn Asp Tyr Gln Gly Pro
515 520 525

Gly Thr Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile Lys Asn
530 535 540 
<210> 14
<211> 56
<212> DNA

<213> Escherichia coli 

<400> 14 

gctaccatgg cttaaccggt accaaggaga tatcatatgt cagtacccgt tcaaca 



<210> 15
<211> 59
<212> DNA

<213> Escherichia coli
<400> 15

gcctcgagtc tagagccgtc gacgggaatt cgagctctta agactgtaaa taaaccacc

<210> 16
<211> 46

<212> DNA

<213> Saccharomyces cerevisiae
<400> 16

gcggtaccaa ggaggtatca tatgttcagt agatctacgc tctgct

<210> 17
<211> 33
<212> DNA

<213> Saccharomyces cerevisiae
<400> 17

gcgaattcga gctcttactc gtccaatttg gcac 34

<210> 18
<211> 35
<212> DNA

<213> Homo sapiens 

<400> 18 

gcggtaccaa ggaggtatca tatgtcagcc gccgc 35 

<210> 19
<211> 39
<212> DNA
<213> Homo sapiens

<400> 19

gcgaattcga gctcttatga gttcttctga ggcactttg 39

<210> 20
<211> 44
<212> DNA

<213> Escherichia coli
<400> 20

gcggtaccaa ggaggtatca tatgaccaat aatccccctt cagc 44

<210> 21
<211> 38
<212> DNA

<213> Escherichia coli

<400> 21

gcgaattcga gctcttagaa cagccccaac ggtttatc 

<210> 22 
<211> 20 
<212> DNA

<213> Escherichia coli

<400> 22

atcccgccgt taaccaccat

<210> 23
<211> 34
<212> DNA

<213> Escherichia coli
<400> 23

gcggtaccat tgttatccgc tcacaattcc acac